mirror of
https://github.com/NetherlandsForensicInstitute/hansken-extraction-plugin-sdk-documentation.git
synced 2026-02-14 14:09:49 +00:00
163 lines
5.1 KiB
Plaintext
163 lines
5.1 KiB
Plaintext
# How to debug an Extraction Plugin
|
|
|
|
Debugging is the art of removing bugs — hopefully quickly.
|
|
|
|
## Locally
|
|
|
|
To debug a plugin locally, it is recommended to start the plugin via the IDE. This has the advantage that breakpoints
|
|
can easily be put in the code instead of printing log statements, for example. To start a plugin locally, a piece of
|
|
code must be added, see [Testing](testing.md) for more information.
|
|
|
|
### Logging
|
|
|
|
The logging of the extraction plugin is displayed in the console.
|
|
|
|
## Locally with Docker
|
|
|
|
Debugging an extraction plugin via docker is a bit trickier. In order to debug in Python, a debugger must be added to
|
|
the extraction plugin. There are several debug modules for Python available, but one debug module that works well with
|
|
Visual Studio Code is [debugpy](https://github.com/microsoft/debugpy). This package is developed by Microsoft
|
|
specifically for use in Visual Studio Code with Python.
|
|
|
|
.. note:: `debugpy` implements the Debug Adapter Protocol (DAP), which is a standardised way for development tools to
|
|
communicate with debuggers.
|
|
|
|
Using `debugpy` with Docker containers requires 4 distinct steps:
|
|
|
|
1. Install `debugpy`
|
|
2. Configuring `debugpy` in Python
|
|
3. Build a docker image
|
|
4. Configuring the connection to the Docker container
|
|
5. Setting breakpoints in your code
|
|
|
|
### Install `debugpy`
|
|
|
|
First, add `debugpy` to your `setup.py`.
|
|
|
|
```python
|
|
from setuptools import setup
|
|
|
|
setup(
|
|
# ...
|
|
install_requires=[
|
|
"hansken-extraction-plugin==0.4.7", # the plugin SDK
|
|
"debugpy==1.5.1"
|
|
]
|
|
)
|
|
```
|
|
|
|
### Configuring `debugpy` in Python
|
|
|
|
At the beginning of your script, import `debugpy`, and call `debugpy.listen()` to start the debug adapter, passing
|
|
a `(host, port)` tuple as the first argument. Use the `debugpy.wait_for_client()` function to block program execution
|
|
until the client is attached.
|
|
|
|
```python
|
|
import debugpy
|
|
|
|
debugpy.listen(("0.0.0.0", 5678))
|
|
debugpy.wait_for_client() # blocks execution until client is attached
|
|
|
|
# your extraction plugin code
|
|
```
|
|
|
|
### Build a Docker image
|
|
|
|
If the Docker image is not built, first build the image as described
|
|
[here](getting_started.md#Building a docker image for a Python plugin).
|
|
|
|
|
|
### Configuring the connection to the Docker container
|
|
|
|
`debugpy` is now set up to accept connections inside a Docker container. To connect to `debugpy` in the docker
|
|
container, port 5678 must be published. To make a port available to services outside of Docker, use the --publish or -p
|
|
flag. This creates a firewall rule which maps a container port to a port on the Docker host to the outside world.
|
|
|
|
To run the extraction plugin with the published port the following command can be used:
|
|
|
|
```bash
|
|
docker run -p 5678:5678 your_extraction_plugin_name
|
|
```
|
|
|
|
The next step is to configure Visual Studio Code. A `launch.json` file must be created in order for Visual Studio Code
|
|
to connect to the extraction plugin in Docker. This minimal `launch.json` example below tells the debugger to attach
|
|
to `localhost` on port `5678`.
|
|
|
|
```json
|
|
{
|
|
"version": "0.2.0",
|
|
"configurations": [
|
|
{
|
|
"name": "Python: Remote Attach",
|
|
"type": "python",
|
|
"request": "attach",
|
|
"connect": {
|
|
"host": "localhost",
|
|
"port": 5678
|
|
},
|
|
"pathMappings": [
|
|
{
|
|
"localRoot": "${workspaceFolder}",
|
|
"remoteRoot": "."
|
|
}
|
|
]
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
### Setting breakpoints in the code
|
|
|
|
The last step is to add breakpoints in the code.
|
|
|
|
### Logging in Docker
|
|
|
|
The logging of the extraction plugin is displayed in the console after running the `docker run` command. In addition,
|
|
the logging is also displayed in the Visual Studio Code console while debugging.
|
|
|
|
## Kubernetes
|
|
|
|
In kubernetes it is currently _not_ possible to debug via `debugpy` because no debug ports are published.
|
|
|
|
### Logging in Kubernetes
|
|
|
|
If there is authorization to the kubernetes cluster, the logging can be viewed with the following command:
|
|
|
|
```bash
|
|
kubectl logs -f hansken-extraction-plugins/your_extraction_plugin_pod
|
|
```
|
|
|
|
## Debug HQL
|
|
|
|
An HQL query can be debugged by running the test framework with the `--verbose` option enabled. The found HQL matches
|
|
will then be displayed in the console. To test a plugin in python with the `--verbose` option enabled use the following
|
|
command:
|
|
|
|
```bash
|
|
test_plugin --standalone plugin/your_plugin.py --regenerate --verbose
|
|
```
|
|
|
|
The following output will then be displayed in the console:
|
|
|
|
```text
|
|
HQL match found for:
|
|
$data.type=jpg
|
|
With trace:
|
|
dataType=jpg
|
|
types={file, data}
|
|
properties={data.raw.mimeType=image/jpg, path=/test-input-trace, file.name=image.jpg, name=test-input-trace, id=0}
|
|
```
|
|
|
|
If the HQL query contains an error, it will be shown in the generated test results. An example of an invalid query
|
|
is ``$data.mimeType=image/jpg`` (slash not escaped). This query will produce an error like the one shown below.
|
|
|
|
```json
|
|
{
|
|
"class": "org.hansken.plugin.extraction.hql_lite.lang.ParseException",
|
|
"message": "HqlLiteHumanQueryParser: line 1:20 token recognition error at: '/jpg'"
|
|
}
|
|
```
|
|
|
|
.. note:: The error is only shown in the generated trace, so to find out the `ParseException` run the ``test_plugin``
|
|
command with the ``--regenerate`` option enabled.
|