Admin Manual

From Ephesoft Community

Getting Started

To get started, make sure that Ephesoft Community Edition is installed on your system. If it is not installed, use the download page to download the installer. If you have trouble installing, refer to the Windows installation instructions or post a question on the forums.

After Ephesoft Community Edition is installed on your system, go to Start > All Programs > Ephesoft, where you should see the following shortcuts:

  • Start Ephesoft Server: Right-click this shortcut and select 'Run as administrator' to start the Java Application Server that runs the Ephesoft software.
  • Ephesoft: This is a shortcut to the URL for the Admin Module. Alternatively, enter the following URL in your browser: http://localhost:8080/dcma/BatchClassManagement.html (NOTE: Only Firefox and Chrome are supported at this time).
  • Stop Ephesoft Server: Stops the Java Application Server that runs Ephesoft.
  • README: Contains information about the installed version of the software and contains instructions on how to use the software.


Batch Class List

The Batch Class List shows all current batch classes in the system. The sample batch classes currently included in the system are RecostarInvoice (for Enterprise Edition) and TesseractInvoice (Community Edition). The column headings are Identifier, Name, Description, Version, and Priority. The version number increases when changes are made to the module's configuration. Selecting the batch class and clicking the Edit button in the top right corner allows administrators to configure the batch class.

3.1 BatchClassList 1001.jpg

Batch Configuration

Clicking the Edit button takes you to the batch class configuration. It is possible to change the default priority based on document types or extraction. Priority levels are 1-25 (Urgent), 26-50 (High), 51-75 (Medium), and 76-100 (Low). An admin can, for example, create document types and index fields and assign a new priority to the batch.
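
The priority bands above can be expressed as a small lookup. The following is a minimal sketch for illustration only; the function name priority_label is hypothetical and is not part of Ephesoft.

```python
def priority_label(priority: int) -> str:
    """Return the priority band for a batch priority between 1 and 100.

    Hypothetical helper: the bands follow the ranges listed in this manual.
    """
    if not 1 <= priority <= 100:
        raise ValueError(f"priority must be between 1 and 100, got {priority}")
    if priority <= 25:
        return "Urgent"
    if priority <= 50:
        return "High"
    if priority <= 75:
        return "Medium"
    return "Low"


print(priority_label(10))  # Urgent
print(priority_label(60))  # Medium
```

A lower number means a more urgent batch, so a batch with priority 10 falls in the Urgent band and one with priority 60 falls in the Medium band.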


Administrators can view and edit the batch class configuration and view the individual plugins that make up the workflow.


Administrators can also view the individual document types of a particular batch class by clicking the Document Types tab.


Clicking the Edit button in the rightmost column of the list of modules takes you to the configuration screen for that particular plugin. The descriptions for the workflow plugins can be found in the software specification page. Each module, and each individual plugin within a module, can be configured by the administrator.

Ephesoft can also monitor email inboxes for incoming mail and attachments. You only need to specify the following:

  • Username: Your email username
  • Password: Your email password
  • Server Name: Your incoming server name
  • Server Type: Type of incoming server, POP3 or IMAP
  • Folder: Name of the folder you want monitored, for example "inbox"
  • Port Number: Incoming port number

3.1 EmailConfiguration 1001.jpg

More information about Email configuration can be found at How_To?#How_to_Configure_Email_Import

Plugin Configuration

Folder Import

Folder Import Plugin

Add file extensions to have the folder import plugin import different file types.


Multipage Files Plugin

Change settings for importing multipage files.


CMIS Import


From version 3.0 onwards, Ephesoft can receive files via CMIS as well as export files via CMIS.

The CMIS Import feature downloads files from a CMIS server and processes them as batches in Ephesoft. With CMIS import, a cron job monitors the CMIS server by checking the specified folder for new files at a specified interval. Along with each document, its properties are downloaded in XML format. Users can write their own custom scripts to access these properties in the batch being executed. A batch is created for every file downloaded.

Format for Downloaded XML





Admins can configure the CMIS server settings in the batch class:

CMIS Import.png

  • Server URL: URL for making connection to CMIS server
  • Username: User name for authentication to the specified CMIS server.
  • Password: Password for authentication to the specified CMIS server.
  • Repository ID: CMIS server repository ID.
  • File Extension: File extensions of the files to download. Version 3.0 supports only pdf and tif files.
  • Folder: Folder name on the CMIS server from where files need to be downloaded.
  • Property: The CMIS property used to select files for download from the CMIS server. Documents that carry this property with the value specified below are marked for download.
    • e.g. cmis:name, cm:description, cm:title, cm:author
  • Value: The value to match for the property above. This key-value pair decides which documents are downloaded.
  • New Value: The value written to the specified CMIS property after the file has been downloaded from the CMIS server. This ensures that the same document is not downloaded again.

CMIS import config.png

URL for fetching repository information in Alfresco: http://{server}:{port}/alfresco/service/cmis/index.html


The CMIS Import feature downloads each file that has a valid file extension and whose CMIS property, as configured in the Property column, has the value given in the Value column. After downloading the file from the CMIS server, the application updates that property to the value configured in the New Value property.


An example helps to illustrate these properties. Suppose the CMIS server contains 15 documents, 10 of which are valid according to the configured file extensions. If Property is set to cm:author and Value is set to “Ephesoft”, then only the documents among those 10 whose cm:author property equals “Ephesoft” are downloaded by the application, and the cm:author property of each downloaded document is updated to the configured New Value.
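
The selection rule described above can be sketched in a few lines. This is an illustration of the logic only and does not talk to a real CMIS server; the CmisDocument class, the two function names, and the new value "Imported" are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CmisDocument:
    """Hypothetical stand-in for a document stored on a CMIS server."""
    name: str
    properties: dict = field(default_factory=dict)


def select_for_import(documents, extensions, prop, value):
    """Return documents with an allowed extension whose property equals value."""
    allowed = tuple("." + ext.lower().lstrip(".") for ext in extensions)
    return [
        doc for doc in documents
        if doc.name.lower().endswith(allowed) and doc.properties.get(prop) == value
    ]


def mark_downloaded(document, prop, new_value):
    """Overwrite the property so the document is not selected again."""
    document.properties[prop] = new_value


documents = [
    CmisDocument("invoice-1.pdf", {"cm:author": "Ephesoft"}),
    CmisDocument("invoice-2.tif", {"cm:author": "Someone Else"}),
    CmisDocument("notes.docx", {"cm:author": "Ephesoft"}),
]

selected = select_for_import(documents, ["pdf", "tif"], "cm:author", "Ephesoft")
print([doc.name for doc in selected])  # ['invoice-1.pdf']

for doc in selected:
    mark_downloaded(doc, "cm:author", "Imported")

# A second pass finds nothing, because the property value has changed.
print(select_for_import(documents, ["pdf", "tif"], "cm:author", "Ephesoft"))  # []
```

The second call returns an empty list, which is why the New Value setting prevents the same document from being imported twice.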

Cron Job Configuration

Cron job scheduler: to change the schedule, update the property file under {Application}\WEB-INF\classes\META-INF\dcma-cmis-import\ and set the following property:

cmisImport.cronxpression=0 0/15 * ? * *

By default, this property is set to run every 15 minutes.
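
To read the default expression, it helps to label its fields. The sketch below assumes the expression follows the Quartz-style six-field layout (second, minute, hour, day of month, month, day of week), which the "?" placeholder suggests; the helper names are hypothetical and this is not a full cron parser.

```python
# Assumed field order for a Quartz-style six-field cron expression.
QUARTZ_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression):
    """Split a six-field cron expression and label each field."""
    parts = expression.split()
    if len(parts) != len(QUARTZ_FIELDS):
        raise ValueError(f"expected {len(QUARTZ_FIELDS)} fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


def expand_step(cron_field, upper):
    """Expand a 'start/step' field into the values it matches below upper."""
    start, step = cron_field.split("/")
    return list(range(int(start), upper, int(step)))


fields = label_cron_fields("0 0/15 * ? * *")
print(fields["minute"])                   # 0/15
print(expand_step(fields["minute"], 60))  # [0, 15, 30, 45]
```

Under that assumption, the default value fires at second 0 of minutes 0, 15, 30, and 45 of every hour, which matches the stated 15 minute interval.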

Disabling/Enabling CMIS Import Functionality

To enable or disable the CMIS import functionality, uncomment or comment out the following line in {Application}\applicationContext.xml:

<import resource="classpath:/META-INF/applicationContext-dcma-cmis-import.xml" />

Default: CMIS import is disabled.

Page Process

Create OCR Input Plugin


Tesseract Plugin

Add file extensions to have the tesseract plugin process different file types.


Recostar Plugin (Enterprise)

This plugin is available in Enterprise Edition only. Add file extensions to have the Recostar plugin process different file types. The property entitled Recostar Project File Name contains the information for the Recostar properties. Administrators can change the filename to use a custom Recostar configuration.


Create Display Image


Create Thumbnails


Classify Images


Barcode Reader Plugin

Administrators can configure the types of barcodes that are read by the system. The default values are CODE39, QR, and DATAMATRIX.


Search Classification Plugin


KV Page Process Plugin

3.1 KV PageProcess 1001.jpg

Document Assembly

See details here

Document Assembler

Administrators can customize the parameters by which documents are assembled.



Barcode Extraction Plugin


Recostar Extraction


Regular Expression Extraction Plugin


Key Value Extraction


Editing Overlays in Advanced KV Extraction

Version 3.0 adds the ability to edit key and value overlays on the Advanced KV Extraction screen, so it is no longer necessary to create a new overlay every time. To edit an existing overlay, left-click and drag the mouse to extend the overlay.

Following is the snapshot of Advanced KV Screen:

3.1 BCM AdvKVExtraction 10001.jpg

Once the key has been captured using the Capture Key button, the Edit Key button gets enabled. Similarly, once the value has been captured using the Capture Value button, the Edit Value button gets enabled.

Once “Edit Key” or “Edit Value” has been clicked, all the other options become disabled on the screen.

While editing a key or value overlay, only one side of the overlay rectangle can be edited at a time. Any of the four sides can be chosen: to edit a side, click nearest to that side, within the area bounded by the parallel lines obtained by extending its two adjacent sides.
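
One possible reading of this rule is sketched below. It is an interpretation for illustration, not the application's actual hit-testing code; the Rect class, the side_to_edit function, and the screen coordinate convention (y grows downward) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Hypothetical overlay rectangle in screen coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def side_to_edit(rect, x, y):
    """Return the side selected by a click at (x, y), or None.

    A side is a candidate only if the click lies in the band between the
    extended lines of its two adjacent sides; the nearest candidate wins.
    """
    candidates = {}
    if rect.top <= y <= rect.bottom:   # band for the left and right sides
        candidates["left"] = abs(x - rect.left)
        candidates["right"] = abs(x - rect.right)
    if rect.left <= x <= rect.right:   # band for the top and bottom sides
        candidates["top"] = abs(y - rect.top)
        candidates["bottom"] = abs(y - rect.bottom)
    if not candidates:
        return None                    # click is diagonal to a corner
    return min(candidates, key=candidates.get)


key_overlay = Rect(left=100, top=50, right=200, bottom=80)
print(side_to_edit(key_overlay, 400, 65))   # right
print(side_to_edit(key_overlay, 150, 10))   # top
print(side_to_edit(key_overlay, 300, 300))  # None
```

In the first call the click is level with the overlay and to its right, so the right side is selected, which corresponds to the use case described in this section.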

The following snapshots explain a use case where a user intends to edit the right hand side of the overlay formed for the key:

The following snapshot shows a captured Key and Value pair:

3.1 BCM AdvKVExtraction 10004.jpg

To edit the Key overlay the user will click on the “Edit Key” button and the screen will appear as shown in the following snapshot:

3.1 BCM AdvKVExtraction 10005.jpg

Click the center of the page to edit the right side of the Key Overlay.

Fuzzy Database Plugin


Fuzzy Extraction

Introduction: Previously, Fuzzy Extraction extracted document level field values from the learned Lucene indexes using only the HOCR content of the batch.

The new enhancement allows Fuzzy Extraction to use either the pre-configured document level field values specified by the admin or the values extracted by an extraction plugin that ran before the Fuzzy Extraction plugin.

For this enhancement two new plugin properties have been added to Fuzzy Db:

  • “Fuzzy Extraction Search Columns based on Fields”: defines the names of the document level fields whose values should be searched for. For example:
    • “$City,$State”: The values of the specified document level fields are queried directly in the Lucene content and the matching database table row is returned. The document level fields of the document concerned are populated accordingly.
  • “Fuzzy Extraction HOCR Switch”: applies when the values of the fields specified in “Fuzzy Extraction Search Columns based on Fields” are not found. ON means the search continues using the HOCR content. OFF means the values extracted by the previous extraction plugin are retained.

Fuzzy 1.jpg

The following are cases that can occur during the Fuzzy Extraction:

  • “FuzzyDB Extraction switch”: OFF
    • Result: No Fuzzy Extraction. Fuzzy Extraction would be turned off.
  • “FuzzyDB Extraction switch”: ON; “Fuzzy Extraction Search Column”: <Empty>; “Fuzzy Extraction HOCR Switch”: N.A
    • Result: Usual Fuzzy Extraction using HOCR content.
  • “FuzzyDB Extraction switch”: ON; “Fuzzy Extraction Search Column”: “$City,$State”; “Fuzzy Extraction HOCR Switch”: OFF
    • Result: If the value of City or State is present in the batch xml, the content is searched in the learned Lucene content and the appropriate database row is returned.
    • Result: If the values of City and State are not found in the batch xml, the values extracted by the previous extraction plugin are retained.
  • “FuzzyDB Extraction switch”: ON; “Fuzzy Extraction Search Column”: “$City,$State”; “Fuzzy Extraction HOCR Switch”: ON
    • Result: If the value of City or State is present in the batch xml, the content is searched in the learned Lucene content and the appropriate database row is returned.
    • Result: If the values of City and State are not found in the batch xml, a Fuzzy Search based on HOCR content is performed.

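The cases above amount to a short decision procedure. The following sketch restates them as a function for illustration; the function name, the returned strings, and the use of a plain dictionary for the batch xml field values are assumptions and do not reflect the plugin's internal code.

```python
def fuzzy_extraction_mode(fuzzy_switch, search_columns, hocr_switch, field_values):
    """Return which behaviour applies for the given plugin settings.

    fuzzy_switch, hocr_switch: "ON" or "OFF"
    search_columns: e.g. "$City,$State", or "" when left empty
    field_values: document level field values found in the batch xml
    """
    if fuzzy_switch != "ON":
        return "no fuzzy extraction"

    # "$City,$State" -> ["City", "State"]
    names = [c.strip().lstrip("$") for c in search_columns.split(",") if c.strip()]
    if not names:
        return "fuzzy search on HOCR content"

    # Any one of the listed fields having a value is enough to search on it.
    if any(field_values.get(name) for name in names):
        return "fuzzy search on field values"

    if hocr_switch == "ON":
        return "fuzzy search on HOCR content"
    return "retain previously extracted values"


print(fuzzy_extraction_mode("ON", "$City,$State", "OFF", {"City": "Irvine"}))
# fuzzy search on field values
print(fuzzy_extraction_mode("ON", "$City,$State", "OFF", {}))
# retain previously extracted values
print(fuzzy_extraction_mode("ON", "$City,$State", "ON", {}))
# fuzzy search on HOCR content
```

The HOCR switch only matters in the last two calls, where neither City nor State has a value in the batch xml.
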
Table Extraction



See details here.




Create Multipage Files

Tell Ephesoft whether or not to export multipage files and change the settings for doing so.

3.1 CreateMultipageFilesPlugin 10001.jpg

CSV File Creation Plugin


Tabbed PDF Plugin


IBM CM Plugin


Copy Batch XML Plugin

3.1 CopyBatchxmlplugin 10001.jpg

CMIS Export Plugin

More information about CMIS settings is available at How_To?#Configure_CMIS


Filebound Export Plugin


NSI Export Plugin


Key Value Learning Plugin


Cleanup Plugin


DB Export Plugin

This plugin is responsible for saving the document level fields for a particular batch instance to the database.

3.1 BCM DB EXPORT 10001.jpg

UI Configuration:

Plugin properties:

  • “Database Export Switch”: The switch that defines whether this plugin is switched ON or OFF.
  • “Database Connection URL”: The database connection URL corresponding to the selected driver.
  • “Database Driver”: Type of driver to be used for database connection.
  • “Database User Name”: User name for the database connection.
  • “Database Password”: Password for the database connection.

Mapping File:

  • The mapping file for this plugin is stored for each batch class at the following path:
    • <SHARED-FOLDERS>\<BATCH_CLASS_IDENTIFIER>\db-export-plugin-mapping\

Mappings should be provided in the following syntax:

DB Export mapping syntax for exporting DLF attribute:


DB Export mapping syntax for exporting table:


Optional Parameters:

Optional Parameter Export Syntax



Extracts the Batch-class Level Fields.


Extracts the document type ID.


Extracts the document type name.


Extracts the Batch Class ID.


Extracts the Batch Instance ID.