hansken-extraction-plugin-s…/0.7.0/dev/python/api_changelog.html
2023-06-23 09:55:36 +02:00

<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.18.1: http://docutils.sourceforge.net/" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Python API Changelog &mdash; Hansken Extraction Plugins for plugin developers 0.7.0
documentation</title>
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="stylesheet" href="../../_static/wider_pages.css" type="text/css" />
<!--[if lt IE 9]>
<script src="../../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="../../_static/jquery.js"></script>
<script src="../../_static/_sphinx_javascript_frameworks_compat.js"></script>
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
<script src="../../_static/doctools.js"></script>
<script src="../../_static/sphinx_highlight.js"></script>
<script src="../../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../../genindex.html" />
<link rel="search" title="Search" href="../../search.html" />
<link rel="next" title="Prerequisites" href="prerequisites.html" />
<link rel="prev" title="Python" href="../python.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="../../index.html" class="icon icon-home">
Hansken Extraction Plugins for plugin developers
</a>
<div class="version">
0.7.0
</div>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="../concepts.html">General concepts</a></li>
<li class="toctree-l1"><a class="reference internal" href="../spec.html">Extraction Plugin specifications</a></li>
<li class="toctree-l1"><a class="reference internal" href="../java.html">Java</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../python.html">Python</a><ul class="current">
<li class="toctree-l2 current"><a class="current reference internal" href="#">Python API Changelog</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#id1">0.7.0</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id2">0.6.1</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id3">0.6.0</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#build-pipeline-change">Build pipeline change</a></li>
<li class="toctree-l4"><a class="reference internal" href="#api-changes">API changes</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="#id4">0.5.1</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id5">0.5.0</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id6">0.4.13</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id7">0.4.7</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id8">0.4.6</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id9">0.4.0</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id10">0.3.0</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id11">0.2.0</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="prerequisites.html">Prerequisites</a></li>
<li class="toctree-l2"><a class="reference internal" href="getting_started.html">Getting started</a></li>
<li class="toctree-l2"><a class="reference internal" href="packaging.html">Packaging</a></li>
<li class="toctree-l2"><a class="reference internal" href="snippets.html">Python code snippets</a></li>
<li class="toctree-l2"><a class="reference internal" href="testing.html">Advanced use of the Test Framework in Python</a></li>
<li class="toctree-l2"><a class="reference internal" href="hanskenpy.html">Run plugins with Hansken.py</a></li>
<li class="toctree-l2"><a class="reference internal" href="debugging.html">How to debug an Extraction Plugin</a></li>
<li class="toctree-l2"><a class="reference internal" href="../python.html#api-documentation">API Documentation</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../examples.html">Examples</a></li>
<li class="toctree-l1"><a class="reference internal" href="../faq.html">Frequently Asked Questions</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../contact.html">Contact</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../changes.html">Changelog</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../../index.html">Hansken Extraction Plugins for plugin developers</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../../index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item"><a href="../python.html">Python</a></li>
<li class="breadcrumb-item active">Python API Changelog</li>
<li class="wy-breadcrumbs-aside">
<a href="../../_sources/dev/python/api_changelog.md.txt" rel="nofollow"> View page source</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="python-api-changelog">
<h1>Python API Changelog<a class="headerlink" href="#python-api-changelog" title="Permalink to this heading"></a></h1>
<p>This document summarizes the changes in the Extraction Plugin API that are important to plugin developers.
For a full list of changes per version, please refer to the general
<a class="reference internal" href="../../changes.html#changelog"><span class="std std-ref">changelog</span></a>.</p>
<section id="id1">
<h2>0.7.0<a class="headerlink" href="#id1" title="Permalink to this heading"></a></h2>
<ul>
<li><p>Escaping the <code class="docutils literal notranslate"><span class="pre">/</span></code> character in matchers is now optional.
This simplifies matchers and improves compatibility between HQL and HQL-Lite.
For more information and examples, see the <a class="reference internal" href="../concepts/hql_lite.html#hqllite-syntax"><span class="std std-ref">HQL-Lite syntax documentation</span></a>.</p>
<p>Examples:</p>
<ul class="simple">
<li><p>Old: <code class="docutils literal notranslate"><span class="pre">file.path:\/Users\/*\/AppData</span></code> -&gt; new: <code class="docutils literal notranslate"><span class="pre">file.path:/Users/*/AppData</span></code></p></li>
<li><p>Old: <code class="docutils literal notranslate"><span class="pre">file.path:\\/Users\\/*\\/AppData</span></code> -&gt; new: <code class="docutils literal notranslate"><span class="pre">file.path:/Users/*/AppData</span></code></p></li>
<li><p>Old: <code class="docutils literal notranslate"><span class="pre">registryEntry.key:\/Software\/Dropbox\/ks*\/Client-p</span></code> -&gt; new: <code class="docutils literal notranslate"><span class="pre">registryEntry.key:/Software/Dropbox/ks*/Client-p</span></code></p></li>
</ul>
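<p>Because escaping is optional, existing matchers can stay as they are. If you prefer to clean them up, the rewrite shown in the examples above could be sketched as follows. The helper name <code class="docutils literal notranslate"><span class="pre">unescape_slashes</span></code> is hypothetical and not part of the SDK, and the sketch assumes that a backslash directly before a <code class="docutils literal notranslate"><span class="pre">/</span></code> never has another meaning in your matchers.</p>

```python
import re


def unescape_slashes(matcher: str) -> str:
    """Drop the backslashes that older matchers placed before '/'.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    # One or more backslashes directly before a slash collapse to a bare slash.
    return re.sub(r'\\+/', '/', matcher)


old_matcher = r'file.path:\/Users\/*\/AppData'
new_matcher = unescape_slashes(old_matcher)
print(new_matcher)
```

<p>Matchers that are already written in the new style pass through unchanged.</p>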
</li>
<li><p>Hansken returns <code class="docutils literal notranslate"><span class="pre">file.path</span></code> properties (outside the scope of matchers) as a <code class="docutils literal notranslate"><span class="pre">String</span></code> property,
instead of a list of strings.
Example: <code class="docutils literal notranslate"><span class="pre">trace.get('file.path')</span></code> now returns <code class="docutils literal notranslate"><span class="pre">'/dev/null'</span></code>; previously it returned <code class="docutils literal notranslate"><span class="pre">['dev',</span> <span class="pre">'null']</span></code>.</p></li>
<li><p>Improved plugin loading when using <code class="docutils literal notranslate"><span class="pre">serve_plugin</span></code> and <code class="docutils literal notranslate"><span class="pre">build_plugin</span></code>:
<code class="docutils literal notranslate"><span class="pre">import</span></code> statements now work for modules (Python files) that are located in the same directory structure as the plugin.</p></li>
<li><p>A plugin can now stream data to a trace using <code class="docutils literal notranslate"><span class="pre">trace.open(mode='wb')</span></code>.
This removes the limit on the size of data that could be written.
See also <a class="reference internal" href="snippets.html#python-snippets-data-streaming"><span class="std std-ref">the Python code snippet</span></a>.</p>
<p>Example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">with</span> <span class="n">trace</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="s1">&#39;wb&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">writer</span><span class="p">:</span>
    <span class="n">writer</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="sa">b</span><span class="s1">&#39;a string&#39;</span><span class="p">)</span>
    <span class="n">writer</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="nb">bytes</span><span class="p">(</span><span class="n">another_string</span><span class="p">,</span> <span class="s1">&#39;utf-8&#39;</span><span class="p">))</span>
</pre></div>
</div>
<p><em>Note</em>: streaming does not work when using <code class="docutils literal notranslate"><span class="pre">run_with_hanskenpy</span></code>.</p>
</li>
</ul>
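<p>Since the writer returned by <code class="docutils literal notranslate"><span class="pre">trace.open(mode='wb')</span></code> accepts repeated <code class="docutils literal notranslate"><span class="pre">write</span></code> calls, large data can be copied in fixed-size chunks rather than loaded into memory at once. The following is a minimal sketch under stated assumptions: <code class="docutils literal notranslate"><span class="pre">copy_in_chunks</span></code> is a hypothetical helper, not part of the SDK, and in-memory <code class="docutils literal notranslate"><span class="pre">io.BytesIO</span></code> objects stand in for both the data source and the trace writer.</p>

```python
import io


def copy_in_chunks(source, writer, chunk_size=64 * 1024):
    """Copy bytes from source to writer one chunk at a time.

    Hypothetical helper for illustration only; returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # An empty read means the source is exhausted.
            break
        writer.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for demonstration. In a plugin, the writer would come from
# `with trace.open(mode='wb') as writer:` as shown in the example above.
source = io.BytesIO(b'x' * 200_000)
writer = io.BytesIO()
copied = copy_in_chunks(source, writer)
print(copied)
```

<p>The chunk size is an arbitrary choice here; a suitable value depends on the plugin and its memory limits.</p>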
</section>
<section id="id2">
<h2>0.6.1<a class="headerlink" href="#id2" title="Permalink to this heading"></a></h2>
<ul>
<li><p>The Docker image build script <code class="docutils literal notranslate"><span class="pre">build_plugin</span></code> now accepts additional arguments that extend the docker command.
This is especially handy for specifying a proxy. Build your plugin container image with the following
command:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>build_plugin<span class="w"> </span>PLUGIN_FILE<span class="w"> </span>DOCKER_FILE_DIRECTORY<span class="w"> </span><span class="o">[</span>DOCKER_IMAGE_NAME<span class="o">]</span><span class="w"> </span><span class="o">[</span>DOCKER_ARGS<span class="o">]</span>
</pre></div>
</div>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>Note that the <code class="docutils literal notranslate"><span class="pre">DOCKER_IMAGE_NAME</span></code> argument no longer requires a <code class="docutils literal notranslate"><span class="pre">-n</span></code> parameter to be specified.</p>
</div>
<p>For usage details, see <a class="reference internal" href="packaging.html"><span class="doc">packaging</span></a>.</p>
</li>
</ul>
</section>
<section id="id3">
<h2>0.6.0<a class="headerlink" href="#id3" title="Permalink to this heading"></a></h2>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>This is an API breaking change.
Upgrading your plugin to this version will require code changes.
Plugins built with earlier versions of the SDK, from <cite>0.3.0</cite> onward, will still work with Hansken.</p>
</div>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>It is strongly recommended to upgrade your plugins to this new version because it significantly improves
the start-up time of Hansken. See the migration steps below.</p>
</div>
<p>This release contains both build pipeline changes and API changes.
Please read all changes carefully.</p>
<section id="build-pipeline-change">
<h3>Build pipeline change<a class="headerlink" href="#build-pipeline-change" title="Permalink to this heading"></a></h3>
<ul>
<li><p>Extraction plugin container images are now labeled with PluginInfo. This
allows Hansken to efficiently load extraction plugins.
Migration steps from earlier versions:</p>
<ol class="arabic">
<li><p>Update the SDK version in your <code class="docutils literal notranslate"><span class="pre">setup.py</span></code> / <code class="docutils literal notranslate"><span class="pre">requirements.txt</span></code></p></li>
<li><p>If you come from a version prior to <code class="docutils literal notranslate"><span class="pre">0.4.0</span></code>, or if you use a plugin name
instead of a plugin id in your <code class="docutils literal notranslate"><span class="pre">pluginInfo()</span></code>, switch to the plugin id style
(read instructions for version <code class="docutils literal notranslate"><span class="pre">0.4.0</span></code>)</p></li>
<li><p>Update your build scripts to build your plugin (Docker) container image.
Be sure to <a class="reference internal" href="getting_started.html"><span class="doc">have the Extraction Plugins SDK installed</span></a>.
Then, you should build your plugin container image with the following command:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>build_plugin<span class="w"> </span>PLUGIN_FILE<span class="w"> </span>DOCKER_FILE_DIRECTORY<span class="w"> </span>-n<span class="w"> </span><span class="o">[</span>DOCKER_IMAGE_NAME<span class="o">]</span>
</pre></div>
</div>
<p>For example:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>build_plugin<span class="w"> </span>plugin/chatplugin.py<span class="w"> </span>.<span class="w"> </span>-n<span class="w"> </span>extraction-plugins/chatplugin
</pre></div>
</div>
<p>This will generate a plugin image:</p>
<ul class="simple">
<li><p>The extraction plugin is added to your local image registry (<code class="docutils literal notranslate"><span class="pre">docker</span> <span class="pre">images</span></code>),</p></li>
<li><p>Note that DOCKER_IMAGE_NAME is optional and will default to <code class="docutils literal notranslate"><span class="pre">extraction-plugin/PLUGINID</span></code>, e.g.
<code class="docutils literal notranslate"><span class="pre">extraction-plugin/nfi.nl/extract/chat/whatsapp</span></code>,</p></li>
<li><p>The image is tagged with two tags: <code class="docutils literal notranslate"><span class="pre">latest</span></code>, and your plugin version.</p></li>
</ul>
</li>
</ol>
</li>
</ul>
</section>
<section id="api-changes">
<h3>API changes<a class="headerlink" href="#api-changes" title="Permalink to this heading"></a></h3>
<ul>
<li><p>The field <code class="docutils literal notranslate"><span class="pre">plugin</span></code> has been removed from <code class="docutils literal notranslate"><span class="pre">PluginInfo</span></code>.</p></li>
<li><p>The field <code class="docutils literal notranslate"><span class="pre">pluginId</span></code> should now be the first argument of PluginInfo (when using unnamed arguments).</p>
<p>Old (unnamed arguments):</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">plugin_info</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="s1">&#39;1.0.0&#39;</span><span class="p">,</span> <span class="s1">&#39;description&#39;</span><span class="p">,</span> <span class="n">author</span><span class="p">,</span>
                      <span class="n">MaturityLevel</span><span class="o">.</span><span class="n">PROOF_OF_CONCEPT</span><span class="p">,</span> <span class="s1">&#39;*&#39;</span><span class="p">,</span> <span class="s1">&#39;https://hansken.org&#39;</span><span class="p">,</span>
                      <span class="n">PluginId</span><span class="p">(</span><span class="o">...</span><span class="p">),</span> <span class="s1">&#39;Apache License 2.0&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>New (removed <code class="docutils literal notranslate"><span class="pre">self</span></code>, and moved <code class="docutils literal notranslate"><span class="pre">PluginId(...)</span></code> to first argument position):</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">plugin_info</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span><span class="n">PluginId</span><span class="p">(</span><span class="o">...</span><span class="p">),</span> <span class="s1">&#39;1.0.0&#39;</span><span class="p">,</span> <span class="s1">&#39;description&#39;</span><span class="p">,</span>
                      <span class="n">author</span><span class="p">,</span> <span class="n">MaturityLevel</span><span class="o">.</span><span class="n">PROOF_OF_CONCEPT</span><span class="p">,</span>
                      <span class="s1">&#39;*&#39;</span><span class="p">,</span> <span class="s1">&#39;https://hansken.org&#39;</span><span class="p">,</span> <span class="s1">&#39;Apache License 2.0&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>Old (named arguments):</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">plugin_info</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span><span class="n">plugin</span><span class="o">=</span><span class="bp">self</span><span class="p">,</span>
                      <span class="n">version</span><span class="o">=</span><span class="s1">&#39;1.0.0&#39;</span><span class="p">,</span>
                      <span class="o">...</span><span class="p">)</span>
</pre></div>
</div>
<p>New (removed <code class="docutils literal notranslate"><span class="pre">plugin=self</span></code>):</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">plugin_info</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span><span class="n">version</span><span class="o">=</span><span class="s1">&#39;1.0.0&#39;</span><span class="p">,</span>
                      <span class="o">...</span><span class="p">)</span>
</pre></div>
</div>
</li>
<li><p>Plugin <code class="docutils literal notranslate"><span class="pre">data_context.data_size</span></code> is now a variable instead of a method:</p>
<p>Old:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">trace</span><span class="p">:</span> <span class="n">ExtractionTrace</span><span class="p">,</span> <span class="n">data_context</span><span class="p">:</span> <span class="n">DataContext</span><span class="p">):</span>
    <span class="n">size</span> <span class="o">=</span> <span class="n">data_context</span><span class="o">.</span><span class="n">data_size</span><span class="p">()</span>
</pre></div>
</div>
<p>New:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">trace</span><span class="p">:</span> <span class="n">ExtractionTrace</span><span class="p">,</span> <span class="n">data_context</span><span class="p">:</span> <span class="n">DataContext</span><span class="p">):</span>
    <span class="n">size</span> <span class="o">=</span> <span class="n">data_context</span><span class="o">.</span><span class="n">data_size</span>
</pre></div>
</div>
</li>
<li><p>Declaring required runtime resources in a plugin's info has been simplified.</p>
<p>Extraction plugin resources no longer use the builder pattern.</p>
<p>Old:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span>
    <span class="o">...</span><span class="p">,</span>
    <span class="n">resources</span><span class="o">=</span><span class="n">PluginResources</span><span class="o">.</span><span class="n">builder</span><span class="p">()</span><span class="o">.</span><span class="n">maximum_cpu</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)</span><span class="o">.</span><span class="n">maximum_memory</span><span class="p">(</span><span class="mi">1000</span><span class="p">)</span><span class="o">.</span><span class="n">build</span><span class="p">()</span>
<span class="p">)</span>
</pre></div>
</div>
<p>New:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># no need for a builder, declare resources by direct instantiation</span>
<span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span>
    <span class="o">...</span><span class="p">,</span>
    <span class="n">resources</span><span class="o">=</span><span class="n">PluginResources</span><span class="p">(</span><span class="n">maximum_cpu</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> <span class="n">maximum_memory</span><span class="o">=</span><span class="mi">2048</span><span class="p">)</span>
<span class="p">)</span>
<span class="c1"># or, as before, specify just one resource</span>
<span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span>
    <span class="o">...</span><span class="p">,</span>
    <span class="n">resources</span><span class="o">=</span><span class="n">PluginResources</span><span class="p">(</span><span class="n">maximum_memory</span><span class="o">=</span><span class="mi">4096</span><span class="p">)</span>
<span class="p">)</span>
</pre></div>
</div>
</li>
</ul>
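<p>Shared code that has to run against SDK versions on both sides of the <code class="docutils literal notranslate"><span class="pre">data_size</span></code> change could read the value through a small compatibility helper. This is only a sketch: <code class="docutils literal notranslate"><span class="pre">data_size_of</span></code> and the two stand-in context classes are hypothetical and not part of the SDK.</p>

```python
def data_size_of(data_context):
    """Return the data size for both the old (method) and new (variable) style.

    Hypothetical helper for illustration only.
    """
    size = data_context.data_size
    # Before 0.6.0 data_size was a method; from 0.6.0 on it is a plain value.
    return size() if callable(size) else size


class OldStyleContext:
    """Stand-in for a pre-0.6.0 data context (not an SDK class)."""

    def data_size(self):
        return 42


class NewStyleContext:
    """Stand-in for a 0.6.0 data context (not an SDK class)."""

    data_size = 42


print(data_size_of(OldStyleContext()), data_size_of(NewStyleContext()))
```

<p>Once all plugins are upgraded, the helper can be dropped in favour of reading <code class="docutils literal notranslate"><span class="pre">data_context.data_size</span></code> directly.</p>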
</section>
</section>
<section id="id4">
<h2>0.5.1<a class="headerlink" href="#id4" title="Permalink to this heading"></a></h2>
<ul>
<li><p>Simplify tracelet properties by making the tracelet type prefix optional.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># using a Tracelet object</span>
<span class="n">trace</span><span class="o">.</span><span class="n">add_tracelet</span><span class="p">(</span><span class="n">Tracelet</span><span class="p">(</span><span class="s2">&quot;prediction&quot;</span><span class="p">,</span> <span class="p">{</span>
    <span class="s2">&quot;type&quot;</span><span class="p">:</span> <span class="s2">&quot;example&quot;</span><span class="p">,</span>
    <span class="s2">&quot;confidence&quot;</span><span class="p">:</span> <span class="mf">0.8</span>
<span class="p">}))</span>
<span class="c1"># or without a Tracelet object</span>
<span class="n">trace</span><span class="o">.</span><span class="n">add_tracelet</span><span class="p">(</span><span class="s2">&quot;identity&quot;</span><span class="p">,</span> <span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;John Doe&quot;</span><span class="p">,</span> <span class="s2">&quot;status&quot;</span><span class="p">:</span> <span class="s2">&quot;online&quot;</span><span class="p">})</span>
</pre></div>
</div>
</li>
<li><p>Enabled <em>manual</em> plugin testing, as described in <a class="reference internal" href="testing.html#python-testing"><span class="std std-ref">advanced use of the test framework in Python</span></a>.</p></li>
</ul>
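<p>To make the optional prefix concrete: the short property names used above correspond to the fully prefixed names used in older releases. The mapping can be sketched as below. The function <code class="docutils literal notranslate"><span class="pre">with_type_prefix</span></code> is a hypothetical illustration of the naming equivalence and is not how the SDK necessarily implements it.</p>

```python
def with_type_prefix(tracelet_type, properties):
    """Return a copy of properties in which every key carries the type prefix.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    prefix = tracelet_type + '.'
    return {
        # Keys that already carry the prefix are left untouched.
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in properties.items()
    }


short_form = {'type': 'example', 'confidence': 0.8}
long_form = with_type_prefix('prediction', short_form)
print(long_form)
```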
</section>
<section id="id5">
<h2>0.5.0<a class="headerlink" href="#id5" title="Permalink to this heading"></a></h2>
<ul>
<li><p>Support vector data type in trace properties.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">embedding</span> <span class="o">=</span> <span class="n">Vector</span><span class="o">.</span><span class="n">from_sequence</span><span class="p">((</span><span class="n">width</span><span class="p">,</span> <span class="n">height</span><span class="p">))</span>
<span class="n">tracelet</span> <span class="o">=</span> <span class="n">Tracelet</span><span class="p">(</span><span class="s2">&quot;prediction&quot;</span><span class="p">,</span> <span class="p">{</span>
    <span class="s2">&quot;prediction.type&quot;</span><span class="p">:</span> <span class="s2">&quot;example-vector&quot;</span><span class="p">,</span>
    <span class="s2">&quot;prediction.embedding&quot;</span><span class="p">:</span> <span class="n">embedding</span>
<span class="p">})</span>
<span class="n">trace</span><span class="o">.</span><span class="n">add_tracelet</span><span class="p">(</span><span class="n">tracelet</span><span class="p">)</span>
</pre></div>
</div>
</li>
</ul>
</section>
<section id="id6">
<h2>0.4.13<a class="headerlink" href="#id6" title="Permalink to this heading"></a></h2>
<ul class="simple">
<li><p>When writing input search traces for tests, it is no longer required to explicitly set an <code class="docutils literal notranslate"><span class="pre">id</span></code> property.
Ids are now generated automatically when the tests are executed.</p></li>
</ul>
</section>
<section id="id7">
<h2>0.4.7<a class="headerlink" href="#id7" title="Permalink to this heading"></a></h2>
<ul class="simple">
<li><p>More <code class="docutils literal notranslate"><span class="pre">$data</span></code> matchers are supported in the Hansken.py plugin runner. Before this improvement it was only possible to match
on <code class="docutils literal notranslate"><span class="pre">$data.type</span></code>. Now it is also possible to match on, for example, <code class="docutils literal notranslate"><span class="pre">$data.mimeType</span></code> and <code class="docutils literal notranslate"><span class="pre">$data.mimeClass</span></code>. The <code class="docutils literal notranslate"><span class="pre">$data</span></code>
matcher should still be at the end of the query, as before.</p></li>
</ul>
</section>
<section id="id8">
<h2>0.4.6<a class="headerlink" href="#id8" title="Permalink to this heading"></a></h2>
<ul>
<li><p>It is now possible to specify maximum system resources in the <code class="docutils literal notranslate"><span class="pre">PluginInfo</span></code>. To run a plugin with 0.5 CPU (= 0.5
vCPU/core/hyperthread) and 1 GB of memory, for example, the following configuration can be added to <code class="docutils literal notranslate"><span class="pre">PluginInfo</span></code>:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">plugin_info</span> <span class="o">=</span> <span class="n">PluginInfo</span><span class="p">(</span><span class="o">...</span><span class="p">,</span>
                         <span class="n">resources</span><span class="o">=</span><span class="n">PluginResources</span><span class="o">.</span><span class="n">builder</span><span class="p">()</span><span class="o">.</span><span class="n">maximum_cpu</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)</span><span class="o">.</span><span class="n">maximum_memory</span><span class="p">(</span><span class="mi">1000</span><span class="p">)</span><span class="o">.</span><span class="n">build</span><span class="p">())</span>
</pre></div>
</div>
</li>
</ul>
</section>
<section id="id9">
<h2>0.4.0<a class="headerlink" href="#id9" title="Permalink to this heading"></a></h2>
<ul>
<li><p>Extraction Plugins are now identified with a <code class="docutils literal notranslate"><span class="pre">PluginInfo.PluginId</span></code> containing a domain, category and name. The
method <code class="docutils literal notranslate"><span class="pre">PluginInfo.name(pluginName)</span></code> has been replaced by <code class="docutils literal notranslate"><span class="pre">PluginInfo.id(new</span> <span class="pre">PluginId(domain,</span> <span class="pre">category,</span> <span class="pre">name))</span></code>. More
details on the plugin naming conventions can be found at the <a class="reference internal" href="../concepts/plugin_naming_convention.html"><span class="doc">Plugin naming convention</span></a> section.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">PluginInfo.name()</span></code> is now deprecated (but will still work for backwards compatibility).</p></li>
<li><p>A new license field <code class="docutils literal notranslate"><span class="pre">PluginInfo.license</span></code> has also been added in this release.</p></li>
<li><p>The following example creates a PluginInfo for a plugin with the name <code class="docutils literal notranslate"><span class="pre">TestPlugin</span></code>, licensed under
the <code class="docutils literal notranslate"><span class="pre">Apache</span> <span class="pre">License</span> <span class="pre">2.0</span></code>:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">TestPlugin</span><span class="p">(</span><span class="n">ExtractionPlugin</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">plugin_info</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">PluginInfo</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">PluginInfo</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span>
                          <span class="n">version</span><span class="o">=</span><span class="s1">&#39;1.0.0&#39;</span><span class="p">,</span>
                          <span class="n">description</span><span class="o">=</span><span class="s1">&#39;A plugin for testing.&#39;</span><span class="p">,</span>
                          <span class="n">author</span><span class="o">=</span><span class="n">Author</span><span class="p">(</span><span class="s1">&#39;The Externals&#39;</span><span class="p">,</span> <span class="s1">&#39;tester@holmes.nl&#39;</span><span class="p">,</span> <span class="s1">&#39;NFI&#39;</span><span class="p">),</span>
                          <span class="n">maturity</span><span class="o">=</span><span class="n">MaturityLevel</span><span class="o">.</span><span class="n">PROOF_OF_CONCEPT</span><span class="p">,</span>
                          <span class="n">webpage_url</span><span class="o">=</span><span class="s1">&#39;https://hansken.org&#39;</span><span class="p">,</span>
                          <span class="n">matcher</span><span class="o">=</span><span class="s1">&#39;file.extension=txt&#39;</span><span class="p">,</span>
                          <span class="nb">id</span><span class="o">=</span><span class="n">PluginId</span><span class="p">(</span><span class="n">domain</span><span class="o">=</span><span class="s1">&#39;nfi.nl&#39;</span><span class="p">,</span> <span class="n">category</span><span class="o">=</span><span class="s1">&#39;test&#39;</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">&#39;TestPlugin&#39;</span><span class="p">),</span>
                          <span class="n">license</span><span class="o">=</span><span class="s1">&#39;Apache License 2.0&#39;</span>
                          <span class="p">)</span>
</pre></div>
</div>
</li>
</ul>
</section>
<section id="id10">
<h2>0.3.0<a class="headerlink" href="#id10" title="Permalink to this heading"></a></h2>
<ul>
<li><p>Extraction Plugins can now create new datastreams on a Trace through data transformations. Data transformations
describe how data can be obtained from a source.</p>
<p>An example case is an extraction plugin that processes an archive file. The plugin creates a child trace per entry in
the archive file. Each child trace gets a datastream defined by a transformation that marks the start and length of
the entry in the original archive data. Because the transformation only describes where the data is located instead
of copying the data itself, a lot of space is saved.</p>
<p>Although Hansken supports various transformations, the Extraction Plugins SDK currently supports only ranged data
transformations. A ranged data transformation defines data as a list of ranges, each range with an offset and a length
in a byte array.</p>
<p>The following example sets a new datastream with dataType <code class="docutils literal notranslate"><span class="pre">html</span></code> on a trace, by setting a ranged data transformation:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">trace</span><span class="o">.</span><span class="n">add_transformation</span><span class="p">(</span><span class="s1">&#39;html&#39;</span><span class="p">,</span> <span class="n">RangedTransformation</span><span class="p">(</span><span class="n">Range</span><span class="p">(</span><span class="n">offset</span><span class="p">,</span> <span class="n">length</span><span class="p">)))</span>
</pre></div>
</div>
<p>The following example creates a child trace and sets a new datastream with dataType <code class="docutils literal notranslate"><span class="pre">raw</span></code> on it, by setting a ranged
data transformation with two ranges:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">child</span> <span class="o">=</span> <span class="n">trace</span><span class="o">.</span><span class="n">child_builder</span><span class="p">(</span><span class="s1">&#39;new trace&#39;</span><span class="p">)</span>
<span class="n">child</span><span class="o">.</span><span class="n">add_transformation</span><span class="p">(</span><span class="s1">&#39;raw&#39;</span><span class="p">,</span> <span class="n">RangedTransformation</span><span class="o">.</span><span class="n">builder</span><span class="p">()</span>
        <span class="o">.</span><span class="n">add_range</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">)</span>
        <span class="o">.</span><span class="n">add_range</span><span class="p">(</span><span class="mi">50</span><span class="p">,</span> <span class="mi">30</span><span class="p">)</span>
        <span class="o">.</span><span class="n">build</span><span class="p">())</span>
</pre></div>
</div>
<p>More detailed documentation will follow in an upcoming SDK release.</p>
</li>
</ul>
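<p>To illustrate what a ranged data transformation describes, the following is a minimal, SDK-independent sketch. The names <code class="docutils literal notranslate"><span class="pre">ByteRange</span></code> and <code class="docutils literal notranslate"><span class="pre">resolve_ranges</span></code> are hypothetical and are not part of the Extraction Plugins SDK. The sketch assumes that each range is an offset followed by a length, in the same order as <code class="docutils literal notranslate"><span class="pre">Range(offset,</span> <span class="pre">length)</span></code> above, and that the resulting datastream is the concatenation of the ranges in the order they were added.</p>

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ByteRange:
    """Hypothetical stand-in for an SDK range: an offset and a length."""
    offset: int
    length: int


def resolve_ranges(source: bytes, ranges: Iterable[ByteRange]) -> bytes:
    """Return the bytes that the given ranges describe within source."""
    parts = []
    for r in ranges:
        end = r.offset + r.length
        # Reject ranges that fall outside the source data.
        if r.offset < 0 or r.length < 0 or end > len(source):
            raise ValueError(f'range {r} is outside the source data')
        parts.append(source[r.offset:end])
    # Assumption: ranges are concatenated in the order they were added.
    return b''.join(parts)


# A 100-byte source and the two ranges used in the child trace example.
source = bytes(range(100))
described = resolve_ranges(source, [ByteRange(10, 20), ByteRange(50, 30)])
print(len(described))  # 50
```

<p>Under these assumptions the two ranges describe 50 bytes of the parent data, while only the two offset and length pairs have to be stored.</p>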
</section>
<section id="id11">
<h2>0.2.0<a class="headerlink" href="#id11" title="Permalink to this heading"></a></h2>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>This release contains API breaking changes.
Plugins created with an earlier version of the extraction plugin
SDK are not compatible with a Hansken installation that uses <cite>0.2.0</cite> or later.</p>
</div>
<ul>
<li><p>Introduced a new extraction plugin type <code class="docutils literal notranslate"><span class="pre">api.extraction_plugin.DeferredExtractionPlugin</span></code>.
Deferred Extraction plugins can be run at a different extraction stage.
This type of plugin also allows accessing other traces using the searcher.</p></li>
<li><p>The class <code class="docutils literal notranslate"><span class="pre">api.extraction_context.ExtractionContext</span></code> has been renamed to <code class="docutils literal notranslate"><span class="pre">api.data_context.DataContext</span></code>.
The new name <code class="docutils literal notranslate"><span class="pre">DataContext</span></code> represents the class contents better.
Plugins have to update matching import statements accordingly.
Plugins should also rename the named argument <code class="docutils literal notranslate"><span class="pre">context</span></code> of the plugin <code class="docutils literal notranslate"><span class="pre">process()</span></code> method to <code class="docutils literal notranslate"><span class="pre">data_context</span></code>.
This change has no functional side effects.</p>
<p>Old:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">hansken_extraction_plugin.api.extraction_context</span> <span class="kn">import</span> <span class="n">ExtractionContext</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">trace</span><span class="p">,</span> <span class="n">context</span><span class="p">):</span>
    <span class="k">pass</span>
</pre></div>
</div>
<p>New:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">hansken_extraction_plugin.api.data_context</span> <span class="kn">import</span> <span class="n">DataContext</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">trace</span><span class="p">,</span> <span class="n">data_context</span><span class="p">):</span>
    <span class="k">pass</span>
</pre></div>
</div>
</li>
<li><p>Moved <code class="docutils literal notranslate"><span class="pre">api.author.Author</span></code> to <code class="docutils literal notranslate"><span class="pre">api.plugin_info.Author</span></code>, and moved <code class="docutils literal notranslate"><span class="pre">api.maturity_level.MaturityLevel</span></code>
to <code class="docutils literal notranslate"><span class="pre">api.plugin_info.MaturityLevel</span></code>.
This is a more <em>pythonic</em> way of grouping classes into modules. This change has no functional side effects.</p>
<p>Plugins have to update matching import statements accordingly.</p>
<p>Old:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">hansken_extraction_plugin.api.author</span> <span class="kn">import</span> <span class="n">Author</span>
<span class="kn">from</span> <span class="nn">hansken_extraction_plugin.api.maturity_level</span> <span class="kn">import</span> <span class="n">MaturityLevel</span>
<span class="kn">from</span> <span class="nn">hansken_extraction_plugin.api.plugin_info</span> <span class="kn">import</span> <span class="n">PluginInfo</span>
</pre></div>
</div>
<p>New:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">hansken_extraction_plugin.api.plugin_info</span> <span class="kn">import</span> <span class="n">Author</span><span class="p">,</span> <span class="n">MaturityLevel</span><span class="p">,</span> <span class="n">PluginInfo</span>
</pre></div>
</div>
</li>
<li><p>Removed <code class="docutils literal notranslate"><span class="pre">DataContext.get_first_bytes()</span></code> from the public API.</p></li>
<li><p>Removed <code class="docutils literal notranslate"><span class="pre">api.extraction_trace.validate_update_arguments(..)</span></code> from the public API. This method is still invoked
implicitly when setting trace properties.</p></li>
</ul>
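<p>For a codebase that has to be importable with SDK versions both before and after the module moves listed above, one possible approach is an import fallback. The sketch below is an assumption-laden illustration, not an SDK feature: <code class="docutils literal notranslate"><span class="pre">import_first</span></code> is a hypothetical helper, and the runnable demonstration uses standard library modules so that it works without the SDK installed. It only helps for classes that kept their name, such as <code class="docutils literal notranslate"><span class="pre">Author</span></code> and <code class="docutils literal notranslate"><span class="pre">MaturityLevel</span></code>; it does not make a plugin built for an older SDK compatible with a newer Hansken.</p>

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first importable module that defines it."""
    for module_path in module_paths:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # this module does not exist in the installed version
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f'{name} not found in any of: {", ".join(module_paths)}')


# Hypothetical usage with the SDK installed: try the new location first,
# then the pre-0.2.0 location.
# Author = import_first('Author', [
#     'hansken_extraction_plugin.api.plugin_info',
#     'hansken_extraction_plugin.api.author',
# ])

# Runnable demonstration with the standard library: the first module does not
# exist, so the lookup falls through to the second one.
OrderedDict = import_first('OrderedDict', ['no_such_module_for_demo', 'collections'])
print(OrderedDict.__name__)
```

<p>Updating the import statements as shown in the Old and New examples remains the simpler option whenever only one SDK version has to be supported.</p>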
</section>
</section>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="../python.html" class="btn btn-neutral float-left" title="Python" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="prerequisites.html" class="btn btn-neutral float-right" title="Prerequisites" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>&#169; Copyright 2020-2023 Netherlands Forensic Institute.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>