GP-4146 Updates to BSim documentation

caheckman 2023-12-12 22:19:33 +00:00
parent 4c3d7ca925
commit bf4c6b5232
8 changed files with 269 additions and 116 deletions

View File

@ -33,6 +33,24 @@ documentation) from the command-line by just running
This will dump logging messages to the console, and you should see '[lsh]'
listed among the loaded plug-ins as the node starts up.
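The same check can also be made against a running node through the REST API.
What follows is a minimal sketch, not part of the BSim tooling: it assumes the
node listens on https://localhost:9200 and that the plug-in is reported under
the name 'lsh', as in the start-up log. The helper name has_lsh_plugin is
hypothetical. It filters the output of the Elasticsearch _cat/plugins endpoint,
whose default columns are node name, plug-in name, and version.

```shell
# Hypothetical helper: succeeds if the '_cat/plugins' listing on stdin
# contains a plug-in named 'lsh' (second whitespace-separated column).
has_lsh_plugin() {
  awk '$2 == "lsh" { found = 1 } END { exit !found }'
}

# Usage against a live node (assumed URL; add '-u user:password' if the
# node has authentication enabled):
#   curl -k -s "https://localhost:9200/_cat/plugins" | has_lsh_plugin && echo "lsh loaded"
```

On a multi-node cluster the listing has one line per node and plug-in, so this
only confirms that at least one node has loaded the plug-in.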
This will typically start the database with password authentication enabled. An
'elastic' user will be automatically created with a randomly generated password that
gets printed to the console the first time the node is started. To add additional
users, use a curl command like
curl -k -u elastic:XXXXXX -X POST "https://localhost:9200/_security/user/ghidrauser?pretty" -H 'Content-Type: application/json' -d'
{
"password" : "changeme",
"roles" : [ "superuser" ],
"full_name" : "Ghidra User",
"email" : "ghidrauser@example.com"
}
'
Replace XXXXXX with the generated password for the 'elastic' user. This example
creates a user 'ghidrauser', with administrator privileges. The built-in role
'viewer' can be used to create users with read-only access to the database.
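As a sketch of the read-only case, the request body can be generated by a small
helper and piped to curl. The helper name make_user_json and the user name
'ghidraviewer' are illustrative assumptions, not part of BSim or Elasticsearch;
the endpoint and the 'viewer' role are the ones described above.

```shell
# Hypothetical helper: print the JSON body for a create-user request.
# Arguments: password, role, full name. Values must not contain double
# quotes or backslashes, since no JSON escaping is performed here.
make_user_json() {
  printf '{ "password" : "%s", "roles" : [ "%s" ], "full_name" : "%s" }\n' "$1" "$2" "$3"
}

# Usage against a live node (XXXXXX is the generated 'elastic' password;
# 'ghidraviewer' is an example user name):
#   make_user_json changeme viewer "Ghidra Viewer" |
#     curl -k -u elastic:XXXXXX -X POST \
#       "https://localhost:9200/_security/user/ghidraviewer?pretty" \
#       -H 'Content-Type: application/json' -d @-
```

Keeping the body in a function makes it easy to create several users with
different roles without retyping the JSON.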
Once the Elasticsearch node(s) are running, whether they are a toy or a full
deployment, you can immediately proceed to the BSim 'bsim' command.
The Ghidra/BSim client and 'bsim' command automatically assume an
@ -60,7 +78,7 @@ documentation included with Ghidra for full details.
Version:
The current BSim plug-in was designed and tested with Elasticsearch version 7.17.4.
The current BSim plug-in was tested with Elasticsearch version 8.8.1.
A change to the Elasticsearch scripting interface, starting with version 7.15, makes the BSim
plug-in incompatible with previous versions, but the lsh plug-in jars may work without change
across later Elasticsearch versions.
@ -75,7 +93,7 @@ the zip file. Within the zip archive, the version number is stored in a configur
The file format is fairly simple: edit the line
elasticsearch.version=7.17.4
elasticsearch.version=8.8.1
The plugin may work with other nearby versions, but proceed at your own risk.
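The edit itself can be scripted. The sketch below assumes the configuration
file has already been extracted from the zip archive; the helper name
set_es_version and the file path passed to it are hypothetical. It rewrites
only the elasticsearch.version line and leaves every other line untouched.

```shell
# Hypothetical helper: rewrite the elasticsearch.version line in a
# properties file. Arguments: path to the file, new version string.
set_es_version() {
  sed "s/^elasticsearch\.version=.*/elasticsearch.version=$2/" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Usage (the path is a placeholder for the extracted configuration file):
#   set_es_version /path/to/extracted/config-file 8.8.1
```

The edited file then needs to be placed back into the zip archive before the
plug-in is installed.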

View File

@ -4,7 +4,7 @@
<config key="checkpoint_timeout">30min</config> <!-- Amount of time before all database records are flushed to disk -->
<config key="listen_addresses">'*'</config> <!-- '*' = all available, '0.0.0.0' just IPv4, 'localhost' -->
<config key="ssl">on</config> <!-- Enable server to connect via SSL -->
<!-- <config key="ssl_ciphers">TLSv1.2</config> -->
<!-- <config key="ssl_min_protocol_version">TLSv1.3</config> -->
<config key="password_encryption">scram-sha-256</config>
<!-- <connect db="all" user="all" type="local" method="trust"/> -->

View File

@ -15,23 +15,35 @@
# limitations under the License.
##
#
# This script may be used to build the postgresql server within
# a GHIDRA installation. The postgresql server configuration options
# below (POSTGRES_CONFIG_OPTIONS) may be adjusted if required
# (e.g., build without openssl use, etc.).
# This script builds the postgresql server and BSim extension within a
# GHIDRA installation.
#
# The PostgreSQL source distribution file postgresql-15.3.tar.gz must
# be placed in the BSim module directory prior to running this script.
# This file can be downloaded directly from the PostgreSQL website at:
#
# https://www.postgresql.org/ftp/source/v15.3
#
# Within development environments, this script will first check the
# ghidra.bin repo for this source file.
#
# The postgresql server configuration options below
# (POSTGRES_CONFIG_OPTIONS) may be adjusted if required (e.g., build
# without openssl use, etc.).
#
# See https://www.postgresql.org/docs/15/install-procedure.html
# for supported postgresql config options.
#
# Additional packages may need to be installed include to perform the
# Additional software may need to be installed in order to perform the
# postgresql build. Please refer to the following web page for
# package dependencies:
# software dependencies:
#
# https://www.postgresql.org/docs/current/install-requirements.html
#
# Or for Linux specific package dependencies, see:
#
# https://wiki.postgresql.org/wiki/Compile_and_Install_from_source_code
#
# The postgresql source distribution should reside within the BSim module
# directory prior to running this script. Within development environments
# it will first check the ghidra.bin repo for this source file.
#
POSTGRES=postgresql-15.3

View File

@ -184,7 +184,7 @@
<H3 class="title">Note</H3>
<P>The PostgreSQL server software is currently only supported for the <SPAN class=
"emphasis"><EM>Linux</EM></SPAN> and <SPAN class="emphasis"><EM>MacOS</EM></SPAN>
"emphasis"><EM>Linux</EM></SPAN> and <SPAN class="emphasis"><EM>macOS</EM></SPAN>
architectures. Elasticsearch server software must be obtained separately. Small local
file-based databases are supported on all platforms via an embedded H2 database
engine. The BSim client

View File

@ -56,11 +56,12 @@
</DIV>
<P><SPAN class="command"><STRONG>bsim_ctl</STRONG></SPAN> is a command-line utility for
starting and stopping a BSim server using the PostgreSQL back-end that is prepackaged with
the Ghidra distribution. All commands must be run on the machine hosting the server.
Optional parameters for a given command are indicated by square brackets '[' and ']'.
Options with an '=' character require a user specified value. If the value string requires
space characters, it should be enclosed in double quotes.</P>
starting and stopping a BSim server using the PostgreSQL back-end. The utility cannot be
used with either an Elasticsearch server or a local H2 database.
All commands must be run on the machine hosting the server.
Optional parameters for a given command are indicated by square brackets '[' and ']'.
Options with an '=' character require a user specified value. If the value string requires
space characters, it should be enclosed in double quotes.</P>
<DIV class="informalexample">
<DIV class="variablelist">
@ -411,7 +412,7 @@
<P>Generates function signatures and metadata for all program files retrieved from
a Ghidra Server repository or project as specified by a Ghidra URL. The generated
signatures may be retained as XML "sigs_" files within a specified XML storage
directory and/or commited to a specified BSim database specified with the <SPAN
directory and/or committed to a specified BSim database specified with the <SPAN
class="command"><STRONG>bsim=</STRONG></SPAN><SPAN class=
"emphasis"><EM>bsimURL</EM></SPAN> option. If an XML storage directory is not
specified, a BSim URL must be specified to which the data will be committed.</P>
@ -439,7 +440,7 @@
<DD>
<P>Commit previously generated signatures and metadata (see
<STRONG>signaturerepo</STRONG>) to a BSim repository. A URL specifying the BSim
<STRONG>generatesigs</STRONG>) to a BSim repository. A URL specifying the BSim
repository and a path to a directory containing the "sigs_" XML files to commit are
required.</P>
@ -459,7 +460,7 @@
and metadata generated (see <STRONG>generatesigs</STRONG>). Only metadata: names,
function tags, categories, etc. are changed. Signatures are not affected. The
generated updates may be retained as XML "update_" files within a specified XML
storage directory and/or commited to a specified BSim database specified with the
storage directory and/or committed to a specified BSim database specified with the
<SPAN class="command"><STRONG>bsim=</STRONG></SPAN><SPAN class=
"emphasis"><EM>bsimURL</EM></SPAN> option. If an XML storage directory is not
specified, a BSim URL must be specified to which the data will be committed.</P>
@ -615,8 +616,8 @@
not enough to uniquely specify the executable.</P>
<P><SPAN class="command"><STRONG>--printselfsig</STRONG></SPAN> - If specified, each
function listed will be prefixed by a calculated self-significance score. This value is
expressed as a decimal value.</P>
function listed will be prefixed by a calculated self-significance score. This score is
expressed as a floating-point value.</P>
<P><SPAN class="command"><STRONG>--callgraph</STRONG></SPAN> - If specified, a list
of all library functions called by the identified executable will be listed after
@ -751,7 +752,7 @@
<EM>*.gpr</EM> locator file must be specified with the project name. The project name
should exclude any <EM>.gpr/.rep</EM> suffix. Only the '/' character should be used as a
directory separator. In addition, when running on Windows, the directory path should
include its drive desigation preceeded by a '/' (e.g., <CODE class=
include its drive designation preceded by a '/' (e.g., <CODE class=
"computeroutput">ghidra:/C:/mydir/myproject?/folderA/folderB</CODE>).</P>
</DIV>
@ -812,7 +813,7 @@
<P>For local <EM>file</EM> URLs, the absolute path to the H2 database <EM>*.mv.db</EM> file
must be specified without the <EM>*.mv.db</EM> extension. Only the '/' character should be
used as a directory separator. In addition, when running on Windows, the directory path
should include its drive desigation preceeded by a '/' (e.g., <CODE class=
should include its drive designation preceded by a '/' (e.g., <CODE class=
"computeroutput">file:/C:/mydir/mydb</CODE>).</P>
</DIV>
</DIV>

View File

@ -41,26 +41,35 @@
and configuration however, the two servers are completely separate.</P>
<P>There are two choices for deploying a shared server for the BSim Database: PostgreSQL or
Elasticsearch. In addition, a local file-based database may be employed which utilizes an
integrated H2 Database engine. This file-based database is intended for smaller datasets
and its use is limited to a single process.</P>
Elasticsearch. Both options support multiple users and allow multiple simultaneous connections
to the remote server. A single database can ingest large datasets, while
still maintaining short query times.</P>
<P>
Alternately, a single user can create a BSim database on their local file-system,
without a server, by utilizing the H2 database engine integrated into Ghidra. This option
is intended for querying against small datasets and does not require installation of
additional server software.</P>
<P>PostgreSQL software, including the extension necessary for BSim signature indexing,
comes prepackaged with the Ghidra distribution. It runs on a single host and makes
efficient use of whatever CPU, memory, and disk resources are made available to it.
PostgreSQL is a highly robust and capable server that should perform well on minimally
configured workstations up to high-end production hardware.</P>
<P>A PostgreSQL server, which must be built with a BSim specific extension,
runs on a single host and makes efficient use of whatever CPU, memory, and disk resources
are made available to it. PostgreSQL is a robust and capable server that should perform well
on minimally configured workstations up to high-end production hardware. Source for the
BSim extension to PostgreSQL is included as part of the Ghidra installation, but the
PostgreSQL source may need to be obtained separately by the database administrator.
See <A class="xref" href="DatabaseConfiguration.html#PostBuild">&ldquo;Building the Server&rdquo;</A>.
</P>
<P>An Elasticsearch BSim plug-in is included with the Ghidra distribution, but the core
server software must be obtained separately by the database administrator. Elasticsearch is
a scalable text search and analytics database. It automatically distributes itself across
machines in a cluster, allowing individual database queries and requests to be serviced in
parallel. Support for BSim in Elasticsearch should still be considered in prototype, but
all major functionality has been implemented, and the BSim schema takes full advantage of
Elasticsearch as a distributed database.</P>
<P>An Elasticsearch server, which must have a BSim specific plug-in installed, runs
as a scalable database that automatically distributes itself across machines in a cluster,
allowing individual database queries and requests to be serviced in parallel. The Elasticsearch
BSim plug-in is included with the Ghidra installation, but the core server software must be obtained
separately by the database administrator.
See <A class="xref" href="DatabaseConfiguration.html#ElasticInstall">&ldquo;Installing the Plug-in&rdquo;</A>.
</P>
<P>BSim clients included in the base Ghidra distribution can interface to any of these
databases.</P>
databases. Users that just want to connect to an existing shared server via a BSim client do not need to
install any server software themselves.</P>
</DIV>
<DIV class="section">
@ -82,35 +91,104 @@
</DIV>
</DIV>
<P>The base Ghidra distribution comes with the PostgreSQL software and the extensions
necessary for supporting a BSim database. The PostgreSQL server is most easily managed
using the <SPAN class="bold"><STRONG>bsim_ctl</STRONG></SPAN> command-line script. When
<SPAN class="bold"><STRONG>bsim_ctl start</STRONG></SPAN> is run for the first time (see
below), the PostgreSQL software is unpacked, depending on the host OS, to either</P>
<DIV class="sect3">
<DIV class="titlepage">
<DIV>
<DIV>
<H4 class="title"><A name="PostBuild"></A>Building the Server</H4>
</DIV>
</DIV>
</DIV>
<DIV class="informalexample">
<TABLE border="0" summary="Simple list" class="simplelist">
<TR>
<TD><CODE class="computeroutput">$(ROOT)/Ghidra/Features/BSim/os/linux64/postgresql
OR</CODE></TD>
</TR>
<P>In order to use PostgreSQL as a BSim server, it must be built with a BSim specific
extension, provided as part of the Ghidra installation. Prebuilt servers, like those
provided as OS distribution packages, will not work as is with BSim. For users on Linux
and macOS, the Ghidra installation provides a script, <CODE>make-postgres.sh</CODE>,
in the module directory <CODE>Ghidra/Features/BSim</CODE> that builds both the PostgreSQL
server and the BSim extension from source and prepares the installation for use with
Ghidra. If not already included in the Ghidra installation, the source distribution
file, currently <CODE>postgresql-15.3.tar.gz</CODE>, can be obtained from the PostgreSQL
website at </P>
<DIV class="informalexample">
<TABLE border="0" summary="Simple list" class="simplelist">
<TR>
<TD><CODE class="computeroutput">https://www.postgresql.org/ftp/source/v15.3
</CODE></TD>
</TR>
</TABLE>
</DIV>
<P>The steps to build the PostgreSQL server with the BSim extension then are:</P>
<TR>
<TD><CODE class=
"computeroutput">$(ROOT)/Ghidra/Features/BSim/os/osx64/postgresql</CODE></TD>
</TR>
</TABLE>
</DIV>
<P>1) If not already present, place the PostgreSQL source distribution file
<CODE>postgresql-15.3.tar.gz</CODE> in the Ghidra installation at</P>
<P>BSim will not operate with PostgreSQL without the Ghidra specific extensions, but
otherwise the provided installation is standard. It can be configured just like any other
stand-alone PostgreSQL server. PostgreSQL is highly configurable, and there are no direct
restrictions on modifying the configuration values. A default configuration is provided
with this installation that has been tuned specifically for the BSim Database
application, so in practice there may be little reason to modify it. But there are a few
standard configuration values for the server that might need adjusting. These do impact
important aspects of the server, like the amount of memory allocated to the server and
access restrictions.</P>
<DIV class="informalexample">
<TABLE border="0" summary="Simple list" class="simplelist">
<TR>
<TD><CODE class="computeroutput">$(ROOT)/Ghidra/Features/BSim/postgresql-15.3.tar.gz
</CODE></TD>
</TR>
</TABLE>
</DIV>
<P>2) From the command-line, within the same directory, run the script <CODE>make-postgres.sh</CODE></P>
<DIV class="informalexample">
<TABLE border="0" summary="Simple list" class="simplelist">
<TR>
<TD><CODE class="computeroutput">cd $(ROOT)/Ghidra/Features/BSim
</CODE></TD>
</TR>
<TR>
<TD><CODE class="computeroutput">./make-postgres.sh
</CODE></TD>
</TR>
</TABLE>
</DIV>
<P>Additional packages or software may need to be installed on the host OS in order for the
build to complete successfully; OpenSSL in particular is required for BSim. For the
full list of PostgreSQL software dependencies, refer to:</P>
<DIV class="informalexample">
<TABLE border="0" summary="Simple list" class="simplelist">
<TR>
<TD><CODE class="computeroutput">https://www.postgresql.org/docs/current/install-requirements.html
</CODE></TD>
</TR>
</TABLE>
</DIV>
<P>Once the build has completed successfully,
the <SPAN class="bold"><STRONG>bsim_ctl</STRONG></SPAN> command-line script is ready to use
for starting a server (see
<A class="xref" href="DatabaseConfiguration.html#PostStartStop">&ldquo;Starting and Stopping the Server&rdquo;</A>).
The PostgreSQL server software will run out of the Ghidra installation at</P>
<DIV class="informalexample">
<TABLE border="0" summary="Simple list" class="simplelist">
<TR>
<TD><CODE class="computeroutput">$(ROOT)/Ghidra/Features/BSim/build/os/linux64/postgresql
OR</CODE></TD>
</TR>
<TR>
<TD><CODE class=
"computeroutput">$(ROOT)/Ghidra/Features/BSim/build/os/osx64/postgresql</CODE></TD>
</TR>
</TABLE>
</DIV>
<P>Other than having the extension itself, a BSim enabled PostgreSQL server is completely standard,
and can be configured like any other stand-alone PostgreSQL server. There are no direct restrictions
on modifying the configuration values. A default configuration is provided with this installation that
has been tuned specifically for the BSim Database application, so in practice there may be little reason to
modify it. But there are a few standard configuration values for the server that might need
adjusting. See
<A class="xref" href="DatabaseConfiguration.html#PostAdditionalConfig">&ldquo;Additional Configuration&rdquo;</A>.</P>
</DIV>
<DIV class="sect3">
<DIV class="titlepage">
@ -405,13 +483,11 @@
</DD>
<DT><SPAN class="term"><SPAN class=
"bold"><STRONG>ssl_cipher</STRONG></SPAN></SPAN></DT>
"bold"><STRONG>ssl_min_protocol_version</STRONG></SPAN></SPAN></DT>
<DD>
<P>This controls which ciphers the server allows when negotiating a connection.
The defaults are reasonable, but administrators may want more control. The
setting 'TLSv1.2', for instance, can be used to be compliant with the latest
TLS standard.</P>
<P>This controls the minimum SSL/TLS protocol version used when the server negotiates a connection.
The current default is 'TLSv1.2'.</P>
</DD>
</DL>
</DIV>
@ -462,13 +538,18 @@
</DIV>
</DIV>
<P>A full description of how to configure an Elasticsearch cluster, including how to
start and stop the server, is beyond the scope of this document. In particular, the <SPAN
class="command"><STRONG>bsim_ctl</STRONG></SPAN> command-line, as described in <A class=
"xref" href="DatabaseConfiguration.html#PostConfig" title=
"PostgreSQL Configuration">&ldquo;PostgreSQL Configuration&rdquo;</A>, does not apply to
Elasticsearch. Complete documentation is available on-line from the Elasticsearch
website.</P>
<P>A full description of how to configure an Elasticsearch cluster is beyond the scope of
this document. In particular, the <SPAN class="command"><STRONG>bsim_ctl</STRONG></SPAN>
command-line, as described in <A class="xref" href="DatabaseConfiguration.html#PostConfig" title=
"PostgreSQL Configuration">&ldquo;PostgreSQL Configuration&rdquo;</A>, does not apply to
Elasticsearch. Complete documentation for administering a database is available on-line
from the Elasticsearch website.</P>
<P>The following discussion describes how to set up a toy, or single node, server, using the
free and open Elasticsearch distribution. This distribution includes a REST API for administering
a database, which can be accessed using <CODE>curl</CODE> commands or some other method
to send HTTP requests directly to the node.
</P>
<DIV class="sect3">
<DIV class="titlepage">
@ -485,10 +566,9 @@
class="emphasis"><EM>BSimElasticPlugin</EM></SPAN>, which unpacks into a standard
Ghidra installation. The file <SPAN class="emphasis"><EM>lsh.zip</EM></SPAN> is a
standard Elasticsearch plug-in that must be installed on every node of the cluster
before a BSim repository can be created. The Elasticsearch distribution typically comes
preconfigured for a single node deployment. The description below shows how to enable
BSim on such a toy deployment, but this will need to be extended to support an entire
cluster.</P>
before a BSim repository can be created. The description below shows how to enable
the BSim plug-in for a single node, but this will need to be repeated for any
additional nodes.</P>
<P>Assuming the add-on has been unpacked, the plug-in can be installed to a single node
using the <SPAN class="emphasis"><EM>elasticsearch-plugin</EM></SPAN> command in the
@ -521,6 +601,44 @@
up.</P>
</DIV>
<DIV class="sect3">
<DIV class="titlepage">
<DIV>
<DIV>
<H4 class="title"><A name="ElasticURL"></A>Elasticsearch Security</H4>
</DIV>
</DIV>
</DIV>
<P>The open Elasticsearch distribution starts up with password authentication
enabled by default. When a node is started up for the first time, as described above, an
<SPAN class="bold"><STRONG>elastic</STRONG></SPAN> user is created with a randomly
generated password that is reported, once, to the console. For a toy deployment, it may
be convenient to add additional users via <CODE>curl</CODE> commands. The
following example creates a user named <SPAN class="bold"><STRONG>ghidrauser</STRONG></SPAN>
with a default password "changeme", using the <STRONG>elastic</STRONG> user's credentials.
The generated password for the <STRONG>elastic</STRONG> user must be substituted for the XXXXXX
at the beginning of the command.
</P>
<DIV class="informalexample"><PRE class="programlisting">
curl -k -u elastic:XXXXXX -X POST "https://localhost:9200/_security/user/ghidrauser?pretty" -H 'Content-Type: application/json' -d'
{
"password" : "changeme",
"roles" : [ "viewer" ],
"full_name" : "Ghidra User",
"email" : "ghidrauser@example.com"
}
'
</PRE>
</DIV>
<P>Elasticsearch uses the concept of <EM>roles</EM> to grant access privileges to particular users. The
built-in role <STRONG>viewer</STRONG>, as in the example above, can be used to grant users
read-only access to a database. The built-in <STRONG>superuser</STRONG> role grants
administrator privileges.</P>
</DIV>
<DIV class="sect3">
<DIV class="titlepage">
<DIV>
@ -865,7 +983,7 @@
</DIV>
<P>It is possible to create tailored database configuration templates so that
implementors have a permanent and accessible record of a particular set-up and don't need
implementers have a permanent and accessible record of a particular set-up and don't need
to repeatedly issue <SPAN class="command"><STRONG>bsim setmetadata</STRONG></SPAN> and
<SPAN class="command"><STRONG>bsim addexecategory</STRONG></SPAN> when creating a
database. Other aspects of a database can also be manipulated, like weighting schemes and

View File

@ -36,8 +36,8 @@
</DIV>
</DIV>
<P>The BSim Database uses a standard <SPAN class="bold"><STRONG>Feature
Vector</STRONG></SPAN> approach to compare and index software functions. A <SPAN class=
<P>The BSim Database uses a <SPAN class="bold"><STRONG>feature vector</STRONG></SPAN>
approach to compare and index software functions. A <SPAN class=
"bold"><STRONG>feature</STRONG></SPAN> is an abstraction that simply means a single element
or attribute that can be compared quantitatively between two objects. The set of possible
features used by a particular approach is fixed, and any object being examined is viewed as
@ -250,7 +250,7 @@
to achieve a high confidence for a small function, for single matches viewed in
isolation. Of course a medium to low confidence threshold may be enough to produce a
unique match if the database is small, and a medium to high confidence threshold may
still produce occasional false positives if the database is very large.</P>
still produce occasional false positives even if the database is very large.</P>
</DIV>
</DIV>
</DIV>

View File

@ -110,8 +110,8 @@
<P>To generate features and metadata on an existing repository, use the <SPAN class=
"command"><STRONG>bsim generatesigs</STRONG></SPAN> command. Signatures may be written as
XML files to a local directory and/or comitted directly to a specified BSim database. If
not immediately comitting to a database and only storing the XML files an appropriate
XML files to a local directory and/or committed directly to a specified BSim database. If
not immediately committing to a database and only storing the XML files, an appropriate
database <EM>config=</EM> may be specified in lieu of a BSim database URL
(<EM>bsimURL</EM>) if database specific executable categories and function tags are not
utilized. Use of the <EM>config=</EM> option does not require a running BSim server.</P>
@ -170,13 +170,16 @@
generated XML files by adding the explicit keyword <SPAN class=
"bold"><STRONG>--overwrite</STRONG></SPAN> as another parameter.</P>
<P>Both the Ghidra Server and the BSim server must be running in order for this command
to succeed, as the BSim server provides configuration information that may be relevant to
the signature generation process, such as database specific executable categories or
function tags. Assuming this is the same template used to create the database, the BSim
server no longer needs to be running. As in the example above, configuration information
is pulled from the BSim server and signatures are generated from the Ghidra Server
executables.</P>
<P>In general, both the Ghidra Server and the BSim server must be running in order for the
<SPAN class="command"><STRONG>bsim generatesigs</STRONG></SPAN> command to succeed,
as the BSim server provides configuration information that may be relevant to
the signature generation process, such as database specific executable categories or
function tags. As in the example above, configuration information
is pulled from the BSim server and signatures are generated from the Ghidra Server
executables. If the <SPAN class="bold"><STRONG>config=</STRONG></SPAN>
option is used, assuming the template it specifies is the same one used to create the
database and there are no executable categories or function tags, the BSim server
does not need to be running.</P>
</DIV>
<DIV class="sect2">
@ -202,11 +205,11 @@
</DIV>
<P>This command takes XML signature files in <CODE class="filename">/xmldirectory</CODE>
and writes the metadata in them to a BSim database. All the executable, function, and
feature vector records are committed to their appropriate tables and all the indexing is
updated if supported. The URL refers to the BSim database rather than the Ghidra Server.
The URL cannot be extended with a path. Any executable paths are already encoded within
the XML file data.</P>
and writes the metadata in them to a BSim database, specified by URL. All the executable,
function, and feature vector records are committed to their appropriate tables and all
the indexing is updated if supported. The URL refers to a BSim database rather than a
Ghidra Server and cannot be extended with a path. Any executable paths are already
encoded within the XML file data.</P>
<P>Every executable described within the XML files has a <SPAN class=
"emphasis"><EM>repository</EM></SPAN> and <SPAN class="emphasis"><EM>path</EM></SPAN>
@ -495,15 +498,16 @@ public void adjustTags(Address myaddress) throws Exception {
</DIV>
</DIV>
<P>The BSim server currently has minimal maintenance functionality. Substantial changes
and additions may occur in the near term. It is possible currently to use the SQL command
line tool <SPAN class="command"><STRONG>psql</STRONG></SPAN> bundled with PostgreSQL in
order to make changes directly to the tables. But for very large modifications to the
database, the best option may be to recreate the database, which is slightly less onerous
than it sounds. The most CPU intensive part of the ingest process, Ghidra's
auto-analysis, typically does not need to be rerun across everything. Regenerating the
metadata files and reimporting takes much less time. Additional efficiency may be gained
by dropping and then regenerating the main index after (re)ingesting. (See below)</P>
<P>The <SPAN class="command"><STRONG>bsim</STRONG></SPAN> script provides a minimal number
of maintenance commands for a BSim server, described below. For a PostgreSQL server, it
is possible to use the bundled SQL command line tool
<SPAN class="command"><STRONG>psql</STRONG></SPAN> in order to make changes directly to
the tables. But for very large modifications to the database, the best option may be to
recreate the database, which is slightly less onerous than it sounds. The most CPU
intensive part of the ingest process, Ghidra's auto-analysis, typically does not need to
be rerun across everything. Regenerating the metadata files and reimporting takes much
less time. Additional efficiency may be gained by dropping and then regenerating the main
index after (re)ingesting. (See below)</P>
<DIV class="sect2">
<DIV class="titlepage">
@ -597,7 +601,7 @@ public void adjustTags(Address myaddress) throws Exception {
optional <SPAN class="bold"><STRONG>--overwrite</STRONG></SPAN> parameter, causing it
to overwrite any previously generated XML files. If a
<STRONG>bsim=<EM>bsimURL</EM></STRONG> is specified with the <STRONG>--commit</STRONG>
option updates will be commited directly to the database. A BSim database commit is
option updates will be committed directly to the database. A BSim database commit is
always performed using the specified <EM>bsimURL</EM> if an <EM>xmldirectory</EM> is
not specified.</P>
@ -620,7 +624,7 @@ public void adjustTags(Address myaddress) throws Exception {
</DIV>
</DIV>
<P><STRONG>NOTE:</STRONG> Applies to PostgrSQL or Elasticsearch databases only</P>
<P><STRONG>NOTE:</STRONG> Applies to PostgreSQL or Elasticsearch databases only</P>
<P>For those users performing large ingests or who find themselves rebuilding the
database frequently, it is possible to drop the main index, ingest data, then recreate
@ -649,8 +653,8 @@ public void adjustTags(Address myaddress) throws Exception {
<P>The time it takes to rebuild depends directly on the number of functions that have
been ingested. For very large collections, rebuilding can take hours or days. The
database can still be accessed while the index is dropped, but query times will likely
be too long for the database to be usable.</P>
database can still be accessed while the index is dropped, but queries may take
much longer to complete.</P>
</DIV>
<DIV class="sect2">
@ -662,7 +666,7 @@ public void adjustTags(Address myaddress) throws Exception {
</DIV>
</DIV>
<P><STRONG>NOTE:</STRONG> Applies to PostgrSQL databases only</P>
<P><STRONG>NOTE:</STRONG> Applies to PostgreSQL databases only</P>
<P>A maintainer can issue the <SPAN class="command"><STRONG>bsim
prewarm</STRONG></SPAN> command to prepopulate RAM with commonly accessed portions of a
@ -706,7 +710,7 @@ public void adjustTags(Address myaddress) throws Exception {
existing BSim database will be incompatible with both the client and server from a new
release.</P>
<P>Unfortunately, the only option to upgrade in these case is to reingest the executables
<P>Unfortunately, the only option to upgrade in these cases is to reingest the executables
into a new BSim database. Frequently the first two stages of ingest (See <A class="xref"
href="IngestProcess.html" title="Ingesting Executables"><I>Ingesting
Executables</I></A>), importing executables to a Ghidra Server and running auto-analysis,