Admin Manual
From Ephesoft Community
Getting Started
To get started, make sure that Ephesoft Community Edition is installed on your system. If it is not installed, use the download page to download the installer. If you have trouble installing, refer to the Windows installation instructions or post a question on the forums.
After Ephesoft Community Edition is installed on your system, go to Start button -> All Programs -> Ephesoft, where you should see the following shortcuts:
- Start Ephesoft Server: Right-click this shortcut and select 'Run as administrator' to start the Java Application Server that runs the Ephesoft software.
- Ephesoft: This is a shortcut to the URL for the Admin Module. Alternatively, enter the following URL in your browser: http://localhost:8080/dcma/BatchClassManagement.html (NOTE: Only Firefox and Chrome are supported at this time).
- Stop Ephesoft Server: Stops the Java Application Server that runs Ephesoft.
- README: Contains information about the installed version of the software and instructions on how to use it.
Configuration
Batch Class List
The Batch Class List shows all current batch classes in the system. The sample batch classes currently included in the system are RecostarInvoice (for Enterprise Edition) and TesseractInvoice (Community Edition). The column headings are Identifier, Name, Description, Version, and Priority. The version number increases when changes are made to the module's configuration. Selecting the batch class and clicking the Edit button in the top right corner allows administrators to configure the batch class.
Batch Configuration
Clicking the Edit button takes you to the batch class configuration. It is possible to change the default priority based on document types or extraction. Priority levels are 1-25 (Urgent), 26-50 (High), 51-75 (Medium), and 76-100 (Low). An admin can, for example, create document types and index fields and assign a new priority to the batch.
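The priority bands above can be expressed as a small lookup. The following is a minimal sketch for illustration only; the function name `priority_label` is hypothetical and is not part of Ephesoft.

```python
def priority_label(priority: int) -> str:
    """Map a numeric batch priority (1-100) to the band named in this manual.

    Hypothetical helper for illustration; not an Ephesoft API.
    """
    if not 1 <= priority <= 100:
        raise ValueError(f"priority must be between 1 and 100, got {priority}")
    if priority <= 25:
        return "Urgent"
    if priority <= 50:
        return "High"
    if priority <= 75:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    for value in (1, 25, 26, 60, 100):
        print(value, priority_label(value))
```

Note that a lower number means a more urgent batch.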
Administrators can view and edit the batch class configuration and view the individual plugins that make up the workflow.
Administrators can also view the individual documents of a particular batch class by clicking the Document Types tab.
Clicking the Edit button in the rightmost column of the list of modules takes you to the configuration screen for that particular plugin. The descriptions for the workflow plugins can be found in the software specification page. Each of the modules is configurable by the administrator. The individual plugins for each module can also be configured by the administrator.
Ephesoft can also monitor email inboxes for incoming mail and attachments. You only need to specify the following:
Username: Your email username
Password: Your email password
Server Name: Your incoming server name
Server Type: Type of incoming server, POP3 or IMAP
Folder: Name of the folder you want monitored, for example "inbox"
SSL
Port Number: Incoming port number
More information about Email configuration can be found at How_To?#How_to_Configure_Email_Import
Plugin Configuration
Folder Import
Folder Import Plugin
Add file extensions to have the folder import plugin import different file types.
Multipage Files Plugin
Change settings for importing multipage files.
CMIS Import
Introduction
From version 3 onwards, Ephesoft can receive files via CMIS as well as export files via CMIS.
The CMIS Import feature downloads files from a CMIS server and processes them as batches in Ephesoft. A cron job monitors the CMIS server, checking the specified folder for new files at the configured interval. Each document's properties are downloaded along with it, in XML format. Users can write their own custom scripts to access these properties in the batch being executed. One batch is created for every file downloaded.
Format for Downloaded XML
<CmisImport>
  <Properties>
    <Property>
      <Name>Description</Name>
      <Value/>
    </Property>
    <Property>
      <Name>Title</Name>
      <Value>BI1E_documentDOC2.pdf</Value>
    </Property>
    ...
  </Properties>
</CmisImport>
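As an illustration of reading the downloaded properties, the sketch below parses XML of the shape shown above into a name/value dictionary. It uses Python only for demonstration; the function name `read_cmis_properties` and the sample string are assumptions and are not part of the Ephesoft scripting interface.

```python
import xml.etree.ElementTree as ET

# Sample document following the format shown in this section.
SAMPLE = """<CmisImport>
  <Properties>
    <Property>
      <Name>Description</Name>
      <Value/>
    </Property>
    <Property>
      <Name>Title</Name>
      <Value>BI1E_documentDOC2.pdf</Value>
    </Property>
  </Properties>
</CmisImport>"""


def read_cmis_properties(xml_text: str) -> dict:
    """Return {property name: value} for every <Property> element.

    Hypothetical helper. Empty or missing <Value> elements map to "".
    """
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.iter("Property"):
        name = prop.findtext("Name")
        if name is None:
            continue  # skip malformed entries that have no <Name>
        properties[name] = prop.findtext("Value") or ""
    return properties


if __name__ == "__main__":
    for name, value in read_cmis_properties(SAMPLE).items():
        print(f"{name} = {value!r}")
```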
Configuration
Admins can configure the CMIS server in the Batch Class using the following settings:
- Server URL: URL for making connection to CMIS server
- Username: User name for authentication to the specified CMIS server.
- Password: Password for authentication to the specified CMIS server.
- Repository ID: CMIS server repository ID.
- File Extension: The file extensions that will be downloaded. Version 3.0 supports only pdf and tif files.
- Folder: Folder name on the CMIS server from where files need to be downloaded.
- Property: The CMIS property used to select files for download from the CMIS server. Valid documents that have this property set to the value specified in the Value setting below are marked for selection.
- e.g. cmis:name, cm:description, cm:title, cm:author
- Value: The value that the property mentioned above must have. This key/value pair decides which documents are downloaded.
- New Value: The value written to the specified CMIS property after the file has been downloaded from the CMIS server. This ensures that the same document is not downloaded again.
URL for fetching repository information in Alfresco: http://{server}:{port}/alfresco/service/cmis/index.html
Summary
The CMIS Import feature downloads each file that has a valid file extension and whose CMIS property, configured in the Property column, has the value given in the Value column. After downloading the file from the CMIS server, the application updates that property to the value configured in the New Value property.
Example
The following example illustrates these properties. Suppose the CMIS server contains 15 documents, 10 of which are valid according to the configured file extensions. If Property is set to cm:author and Value is set to “Ephesoft”, then only the documents among those 10 whose cm:author property has the value “Ephesoft” are downloaded by the application, and the cm:author property of each downloaded document is updated to the configured New Value.
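The selection rule in this example can be simulated in a few lines. The sketch below works on in-memory objects rather than a real CMIS server; the class `CmisDocument`, the function `select_and_mark`, and the New Value "Processed" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CmisDocument:
    """Stand-in for a document on a CMIS server (illustrative only)."""
    name: str
    properties: dict = field(default_factory=dict)


def select_and_mark(documents, extensions, prop, value, new_value):
    """Return names of documents that would be downloaded, then mark them.

    A document qualifies when its extension is allowed and its `prop`
    equals `value`. After "download", `prop` is set to `new_value` so the
    same document is not picked up on the next run.
    """
    allowed = tuple("." + ext.lower().lstrip(".") for ext in extensions)
    downloaded = []
    for doc in documents:
        if not doc.name.lower().endswith(allowed):
            continue  # extension not configured for import
        if doc.properties.get(prop) != value:
            continue  # property missing or holding a different value
        downloaded.append(doc.name)
        doc.properties[prop] = new_value
    return downloaded


if __name__ == "__main__":
    docs = [
        CmisDocument("a.pdf", {"cm:author": "Ephesoft"}),
        CmisDocument("b.pdf", {"cm:author": "Someone else"}),
        CmisDocument("c.docx", {"cm:author": "Ephesoft"}),
    ]
    print(select_and_mark(docs, ["pdf", "tif"], "cm:author", "Ephesoft", "Processed"))
    # A second run finds nothing, because the property was updated.
    print(select_and_mark(docs, ["pdf", "tif"], "cm:author", "Ephesoft", "Processed"))
```

Choosing a New Value that differs from Value is what prevents repeated downloads; if both were the same, every scheduled run would select the same documents again.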
Cron Job Configuration:
To schedule the cron job, update the following property in the file {Application}\WEB-INF\classes\META-INF\dcma-cmis-import\cmis-import.properties:
cmisImport.cronxpression=0 0/15 * ? * *
By default, this property is set to run every 15 minutes.
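To make the default schedule easier to read, the sketch below splits the expression into named fields and expands the minutes field. It assumes the expression follows the Quartz-style six-field layout (seconds, minutes, hours, day of month, month, day of week), which the "?" placeholder suggests; the helper names are hypothetical and only the simple forms `*`, `N` and `start/step` are handled.

```python
FIELD_NAMES = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Split a six-field cron expression into a {field name: text} mapping."""
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


def expand_minutes(field: str) -> list:
    """Expand a minutes field of the form '*', 'N' or 'start/step'."""
    if field == "*":
        return list(range(60))
    if "/" in field:
        start, step = field.split("/")
        first = 0 if start == "*" else int(start)
        return list(range(first, 60, int(step)))
    return [int(field)]


if __name__ == "__main__":
    fields = split_cron("0 0/15 * ? * *")
    print(fields)
    print("fires at minutes:", expand_minutes(fields["minutes"]))
```

Under that assumption, changing `0/15` to `0/5` in the minutes field would poll the CMIS server every 5 minutes instead.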
Disabling/Enabling CMIS Import Functionality:
To enable or disable the CMIS import functionality, uncomment or comment out the following line in {Application}\applicationContext.xml:
<import resource="classpath:/META-INF/applicationContext-dcma-cmis-import.xml" />
Default: CMIS import is disabled.
Page Process
Create OCR Input Plugin
Tesseract Plugin
Add file extensions to have the Tesseract plugin process different file types.
Recostar Plugin (Enterprise)
This plugin is available in Enterprise Edition only. Add file extensions to have the Recostar plugin process different file types. The property entitled Recostar Project File Name contains the information for the Recostar properties. Administrators can change the filename to use a custom Recostar configuration.
Create Display Image
Create Thumbnails
Classify Images
Barcode Reader Plugin
Administrators can configure the types of barcodes that are read by the system. The default values are CODE39, QR, and DATAMATRIX.
Search Classification Plugin
KV Page Process Plugin
Document Assembly
See details here
Document Assembler
Administrators can customize the parameters by which documents are assembled.
Extraction
Barcode Extraction Plugin
Recostar Extraction
Regular Expression Extraction Plugin
Key Value Extraction
Editing Overlays in Advanced KV Extraction
Version 3.0 adds the ability to edit key and value overlays on the Advanced KV Extraction screen. It is no longer necessary to create a new overlay every time: the user can edit an existing overlay by left-clicking and dragging the mouse to extend the overlay.
The following snapshot shows the Advanced KV Screen:
Once the key has been captured using the Capture Key button, the Edit Key button gets enabled. Similarly, once the value has been captured using the Capture Value button, the Edit Value button gets enabled.
Once “Edit Key” or “Edit Value” has been clicked, all the other options become disabled on the screen.
While editing a key or value overlay, only one side of the rectangle forming the overlay is free for editing at a time; any of the four sides can be edited. To edit a side, the user clicks closest to that side, within the area bounded by the parallel lines formed by extending its adjacent sides.
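One way to read the rule above is sketched below: a side is a candidate only if the click falls inside the band obtained by extending its two adjacent sides, and among the candidates the nearest side wins. This is an interpretation for illustration, not the application's actual code; `Rect` and `side_to_edit` are hypothetical names, and screen coordinates with y growing downward are assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rect:
    """Overlay rectangle in screen coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def side_to_edit(rect: Rect, x: float, y: float) -> Optional[str]:
    """Return the side a click at (x, y) would select, or None."""
    distances = {}
    # Left and right sides: their adjacent sides are top and bottom, which
    # extend into a horizontal band top <= y <= bottom.
    if rect.top <= y <= rect.bottom:
        distances["left"] = abs(x - rect.left)
        distances["right"] = abs(x - rect.right)
    # Top and bottom sides: their adjacent sides are left and right, which
    # extend into a vertical band left <= x <= right.
    if rect.left <= x <= rect.right:
        distances["top"] = abs(y - rect.top)
        distances["bottom"] = abs(y - rect.bottom)
    if not distances:
        return None  # click lies diagonally off a corner: no side selected
    return min(distances, key=distances.get)


if __name__ == "__main__":
    key_overlay = Rect(left=10, top=10, right=110, bottom=40)
    print(side_to_edit(key_overlay, 130, 25))   # to the right of the overlay
    print(side_to_edit(key_overlay, 60, 5))     # just above the overlay
    print(side_to_edit(key_overlay, 200, 200))  # off the corner
```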
The following snapshots explain a use case where a user intends to edit the right hand side of the overlay formed for the key:
The following snapshot shows a captured Key and Value pair:
To edit the Key overlay the user will click on the “Edit Key” button and the screen will appear as shown in the following snapshot:
Fuzzy Database Plugin
Fuzzy Extraction
Introduction: Previously, Fuzzy Extraction extracted document level field values, learned through Lucene indexes, that corresponded to the HOCR content of the batch.
The new enhancement allows Fuzzy Extraction to use pre-configured document level field values specified by the admin, or values extracted by an extraction plugin that ran before the Fuzzy Extraction plugin.
For this enhancement, two new plugin properties have been added to Fuzzy DB:
- “Fuzzy Extraction Search Columns based on Fields”: This property defines the names of the document level fields on which the user wants to search. For example:
- “$City,$State”: The values of the specified document level fields are queried directly in the Lucene content, and the appropriate row of the database table is returned. The document level fields of the document concerned are populated accordingly.
- “Fuzzy Extraction HOCR Switch”: This switch applies when the value specified in “Fuzzy Extraction Search Columns based on Fields” is not found. ON means the search continues using the HOCR content. OFF means the search uses the values extracted by the previous extraction plugin.

The following are cases that can occur during the Fuzzy Extraction:
| “FuzzyDB Extraction switch” Value | “Fuzzy Extraction Search Column” Value | “Fuzzy Extraction HOCR Switch” Value | Result |
|---|---|---|---|
| OFF | N.A | N.A | |
| ON | <Empty> | N.A | |
| ON | “$City,$State” | OFF | Else |
| ON | “$City,$State” | ON | Else |
Table Extraction
Review
See details here.
Validation
See details here.
Export
Create Multipage Files
Tell Ephesoft whether or not to export multipage files and change the settings for doing so.
CSV File Creation Plugin
Tabbed PDF Plugin
IBM CM Plugin
Copy Batch XML Plugin
CMIS Export Plugin
More information about CMIS settings is available at How_To?#Configure_CMIS
Filebound Export Plugin
NSI Export Plugin
Key Value Learning Plugin
Cleanup Plugin
DB Export Plugin
This plugin is responsible for saving the document level fields for a particular batch instance to the database.
UI Configuration:
Plugin properties:
- “Database Export Switch”: The switch that defines whether this plugin is switched ON or OFF.
- “Database Connection URL”: The database connection URL corresponding to the selected driver.
- “Database Driver”: Type of driver to be used for database connection.
- “Database User Name”: The user name for the database connection.
- “Database Password”: The password for the database connection.
Mapping File:
- The mapping file for this plugin is stored for each batch class at the following path:
- <SHARED-FOLDERS>\<BATCH_CLASS_IDENTIFIER>\db-export-plugin-mapping\db-export-mapping.properties
Mappings should be provided in the following syntax: <Document Type>.<Document Level Field Name>=<Database Table Name>:<Database Table Column Name>
For example:
- Invoice.type=testTable:invoiceType
- Invoice.sender=testTable:invoiceSender
- Invoice.receiver=testTable:invoiceReceiver
- Invoice.total=testTable:invoiceTotal
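The mapping syntax above is straightforward to parse. The following sketch reads mapping lines into a dictionary keyed by document type and field name; the function names are hypothetical, and skipping blank lines and lines starting with "#" is an assumption borrowed from common properties-file conventions.

```python
def parse_mapping_line(line: str) -> tuple:
    """Parse '<Document Type>.<Field Name>=<Table>:<Column>' into a 4-tuple."""
    key, equals, target = line.strip().partition("=")
    doc_type, dot, field_name = key.strip().partition(".")
    table, colon, column = target.strip().partition(":")
    parts = [p.strip() for p in (doc_type, field_name, table, column)]
    if not (equals and dot and colon and all(parts)):
        raise ValueError(f"invalid mapping line: {line!r}")
    return tuple(parts)


def parse_mapping(text: str) -> dict:
    """Return {(document type, field name): (table, column)}."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank lines and comments are ignored
        doc_type, field_name, table, column = parse_mapping_line(line)
        mapping[(doc_type, field_name)] = (table, column)
    return mapping


if __name__ == "__main__":
    sample = """
    Invoice.type=testTable:invoiceType
    Invoice.sender=testTable:invoiceSender
    Invoice.receiver=testTable:invoiceReceiver
    Invoice.total=testTable:invoiceTotal
    """
    for source, target in parse_mapping(sample).items():
        print(source, "->", target)
```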
Database table structure:
The database table used for the mappings should conform to the standard below. The first four fields, i.e. BATCH_INSTANCE_ID, BATCH_CLASS_ID, DOCUMENT_TYPE and DOCUMENT_LEVEL_FIELD_NAME, must exist, while the rest of the columns can vary. Instead of specifying a VALUE field, the user can specify multiple custom fields, one for each Document Level Field.
| Field Name | Null allowed | Mandatory |
|---|---|---|
| BATCH_INSTANCE_ID | NO | YES |
| BATCH_CLASS_ID | NO | YES |
| DOCUMENT_TYPE | NO | YES |
| DOCUMENT_LEVEL_FIELD_NAME | NO | YES |
| VALUE | YES | NO |
Consider the mapping:
- Invoice.type=exportTable:invoiceType
- Invoice.sender=exportTable:invoiceSender
- Invoice.total=exportTable:invoiceTotal
For the above mapping file, the table structure should be:
CREATE TABLE `exportTable` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `BATCH_INSTANCE_ID` VARCHAR(255) NOT NULL,
  `BATCH_CLASS_ID` VARCHAR(255) NOT NULL,
  `DOCUMENT_TYPE` VARCHAR(255) NOT NULL,
  `DOCUMENT_LEVEL_FIELD_NAME` VARCHAR(255) NOT NULL,
  `invoiceType` VARCHAR(255) NULL DEFAULT NULL,
  `invoiceSender` VARCHAR(255) NULL DEFAULT NULL,
  `invoiceTotal` INT(11) NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
)
The following issue was reported by a client: “I can query the policy# in SQL, but nothing in Ephesoft.”
Since the user can search in SQL but not through the Ephesoft application, the following could be causes of this issue:
- The learned indexes are not in sync with the database values. Try to re-learn the database by clicking the “Learn-Db” button on the batch class management screen for the batch class concerned, then search for that policy number again.
- The property “fuzzydb.ignore_list” in the property file “<Ephesoft-installation-folder>\Application\WEB-INF\classes\META-INF\dcma-fuzzydb\fuzzy-db.properties” is set to values that cause all search results to be ignored. This property holds “;”-separated values. For example, if fuzzydb.ignore_list=APPLICATION;INVOICE;CA, then all results with any mapped value equal to “APPLICATION”, “INVOICE” or “CA” are ignored. Make sure that is not the case.
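The effect of the ignore list can be reproduced with a short simulation, which may help when checking whether a configured value is hiding expected results. This is a sketch under stated assumptions: the function names and sample rows are hypothetical, and matching is assumed to be exact and case-sensitive, which the manual does not specify.

```python
def parse_ignore_list(value: str) -> set:
    """Split a ';'-separated fuzzydb.ignore_list value into a set of entries."""
    return {item.strip() for item in value.split(";") if item.strip()}


def drop_ignored(rows: list, ignore: set) -> list:
    """Keep only rows in which no mapped value equals an ignored entry."""
    return [
        row for row in rows
        if not any(value in ignore for value in row.values())
    ]


if __name__ == "__main__":
    ignore = parse_ignore_list("APPLICATION;INVOICE;CA")
    # Hypothetical search results: one dictionary of mapped values per row.
    results = [
        {"policy": "P-100", "state": "CA"},
        {"policy": "P-200", "state": "NY"},
    ]
    print(drop_ignored(results, ignore))
```

In this simulation the row for policy P-100 disappears solely because its state value matches an ignore entry, which mirrors the symptom of a value being present in SQL but missing from the search results.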




























