After Scanning

Configuring “After Scanning”

You can configure PixEdit to execute almost any operation sequence automatically after you have scanned a batch of documents. If you are using DocServer in PixEdit, the after scan processing will be executed while you are scanning the next batch of documents.

Independent of scanner type and brand, PixEdit offers automatic document separation using barcodes, quality enhancements, page size detection, automatic orientation, automatic deskew and blank page removal, automatic color and resolution optimization (ACRO), OCR using the OCR snap-in module so that your saved PDFs become searchable, automatic saving to any supported file format with file names based on barcodes, incrementing numbers or time/date, execution of custom-made macros and much more.


Although you will find a lot of useful functions for after-scan processing, your business may have very special requirements that are not covered by the After Scanning dialog. As an example, let’s say you always want to join the pages in groups of three automatically after scanning (this is quite an uncommon requirement, which is why it serves as a good example). You can do this with the help of a macro. Simply open a document, click the macro record button, join the pages in groups of three and specify “All pages” in the join dialog. Stop recording and give your newly recorded macro a name, for example “My special joining”. Now click the Configure button in the After Scanning section of ScanBar and, under the General tab, select your recorded macro from the drop-down list. The pages will now always be joined in groups of three when scanning is complete. Remember to check the Enable option in ScanBar to enable after-scan processing.

You may create many different after-scan profiles, and then load them depending on your various scanning tasks during a production day. Select the profile to be used in the drop-down list in ScanBar. You can create and manage profiles in the Profile tab in the After Scanning dialog.

To configure the automatic after scanning process, click the Configure button just below the big green button in ScanBar. If ScanBar is not visible, press Shift-F10. The dialog box for configuring After Scanning is divided into several tabs. These tabs are: General processing, Job Separation, Forms Processing and Saving. You will also find a separate tab for managing after scanning profiles.

If you always use the same configuration, you don’t need to use profiles. However, if you would like to be able to switch rapidly between different types of processing after scanning, using profiles is a good practice.

Note: you don’t have to wait until the after scanning process has finished before scanning the next batch. You can leave the processing to DocServer by checking the “Processed scanned document with DocServer” option. DocServer will then process your batch in the background while you continue to scan the next batch. If you would like to inspect the automated process, uncheck this option. When using DocServer or Continuous Network scanning, errors during processing will not halt processing. Instead, batches that fail are stored in a folder called “not_processed” in the original source folder so that the automated after scan processing can continue.
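Because failed batches are set aside instead of stopping production, it can be useful to look into the “not_processed” folder from time to time. The following Python sketch shows one possible way to list its contents. It is not part of PixEdit: the function name failed_batches is hypothetical, and the only detail taken from the description above is the folder name “not_processed” inside the source folder.

```python
from pathlib import Path


def failed_batches(source_folder, folder_name="not_processed"):
    """Return the files found in the failure folder, sorted by name."""
    failed_dir = Path(source_folder) / folder_name
    if not failed_dir.is_dir():
        # No such folder: treat this as "no failed batches".
        return []
    return sorted(p for p in failed_dir.iterdir() if p.is_file())
```

An empty list means that nothing is waiting to be re-processed in that source folder.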

The General tab

The General tab contains settings for all types of general processing such as deskew, removal of blank pages, image cleanups, execution of macros, OCR (option), automatic page size, automatic orientation and much more. The following options are available:

Enable after scan processing Check this option to enable the automatic after scan processing.

Automatic color and resolution optimization (ACRO) Use this option to let PixEdit automatically decide which pages in the document should be stored in color, in greyscale or in black/white. In addition, you can configure individual resolution parameters for the different detected page types. You must configure ScanBar to scan in full color in order to utilize ACRO.

Enhance contrast Some scanners may deliver pages with low contrast. To make text darker and grey backgrounds brighter, check this option. For normal documents, choose a clip value between one and four percent. Bright areas will be brighter and dark areas will be darker within the limits of maximum contrast. Some rare pages may have generally low contrast, but still contain areas with maximum and minimum brightness. In such cases you may want to set the clip value even higher. Contrast enhancement works only on color or greyscale pages, and is designed for optimum performance on scanned letters, invoices and similar materials.
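To illustrate why a higher clip value helps on pages that contain a few extreme pixels, here is a minimal Python sketch of a generic contrast stretch with clipping. It is an illustration of the general technique only and makes no claim about how PixEdit implements the option; the function name and the way the clip percentage is applied to both ends of the brightness range are assumptions.

```python
def stretch_contrast(pixels, clip_percent=2.0):
    """Stretch 8-bit grey values to the full 0..255 range.

    clip_percent is the share of the darkest and of the brightest
    pixels that is ignored when the new black and white points are
    chosen (an assumed interpretation of the clip value).
    """
    if not 0 <= clip_percent < 50:
        raise ValueError("clip_percent must be in the range 0..50")
    if not pixels:
        return []
    ordered = sorted(pixels)
    n = len(ordered)
    k = int(n * clip_percent / 100.0)
    low = ordered[k]            # new black point
    high = ordered[n - 1 - k]   # new white point
    if high <= low:
        return list(pixels)     # flat image, nothing to stretch
    scale = 255.0 / (high - low)
    # Values outside low..high are clamped to pure black or white.
    return [max(0, min(255, round((p - low) * scale))) for p in pixels]
```

With a clip value of zero, a single pure black and a single pure white pixel are enough to prevent any stretching, which is the situation described above for pages with generally low contrast.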

Toning Toning is an advanced form of contrast enhancement with more adjustment possibilities. For example, you can choose to only enhance the contrast in the dark or bright areas. Toning is designed for use by scanner specialists and support personnel.

Deskew automatically The automatic deskew function examines the document and corrects any skew introduced by the scanning process. PixEdit will look for text lines or nearly horizontal graphics and then deskew each page accordingly.

Remove blank pages This function automatically removes blank pages. In some cases, blank pages may contain a small amount of graphics because of dark scanner settings or spots on the original document. You may therefore want to adjust the default 0.06% value of acceptable noise level up to 0.1%. Some blank pages may also contain some extra graphics on the edges. For this reason, PixEdit may be configured to ignore a specified area along the edges before analyzing. If you get blank pages in your documents after having used this function, you should always try to increase the margin values before increasing the noise level value.
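The interplay between the noise level and the ignored margins can be sketched in a few lines of Python. This is a simplified model, not PixEdit’s actual detector: it assumes that the noise level is the percentage of dark pixels in the analyzed area, that the margin is given in pixels, and that a grey value below 128 counts as dark. The function name is hypothetical.

```python
def is_blank_page(page, noise_percent=0.06, margin=0, dark_below=128):
    """Decide whether a page (rows of 8-bit grey values) is blank.

    Pixels closer than `margin` to any edge are ignored. The page is
    considered blank when the share of dark pixels in the remaining
    area does not exceed `noise_percent`.
    """
    height = len(page)
    width = len(page[0]) if height else 0
    dark = 0
    total = 0
    for row in page[margin:height - margin]:
        for value in row[margin:width - margin]:
            total += 1
            if value < dark_below:
                dark += 1
    if total == 0:
        return True  # nothing left to analyze
    return 100.0 * dark / total <= noise_percent
```

A dark stripe along one edge makes the page look non-blank at any reasonable noise level, while a small margin removes the stripe from the analysis altogether. This is why the margins should be increased first.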

Remove black borders Some scanners can deliver a black border around each scanned page. PixEdit can use this information to automatically crop the page to the original paper size, making it possible to scan different page sizes in one single batch. You can also configure PixEdit to simply remove the black border without cropping by unchecking “Detect page size and crop automatically”.

Automatically crop half size pages If your scanner cannot deliver black borders around pages, you can still scan a mix of A and B-sized pages and crop automatically by enabling this option. This method may not give as accurate results as the Remove black borders option.

Process using the following macro Although you will find a lot of useful functions for after-scan processing, your business may have very special requirements that are not covered by the General tab. As an example, let’s say you always want to join the pages in groups of three automatically after scanning (this is quite an uncommon requirement, which is why it serves as a good example). You can do this with the help of a macro. Simply open a document, click the macro record button, join the pages in groups of three and specify “All pages” in the join dialog. Stop recording and give your newly recorded macro a name, for example “My special joining”. Now click the Configure button in the After Scanning section of ScanBar and, under the General tab, select your recorded macro from the drop-down list. The pages will now always be joined in groups of three when scanning is complete.

Recognize text using OCR If you want to produce searchable files, check this option. Note that you must also specify PDF as the file format under the Saving tab. You must have PixEdit with the OCR option to create searchable PDF files.

Automatic page orientation Pages may sometimes be scanned with wrong orientation by mistake. Use this option to automatically correct such pages. Automatic page orientation is only available if you have PixEdit with the OCR option.

Open and view the saved documents (inspection) To inspect each scanned page after saving, check this option. Each page will be displayed for a configured amount of time. If you have completed your inspection before the specified time has elapsed, press the space bar on your keyboard to skip to the next page.

Show me the saved files in the thumbnail bar The thumbnail bar will be displayed, showing the automatically saved files as small thumbnails directly after scanning. Holding the cursor over a thumbnail will display a larger view. To open the file for close inspection, double click the thumbnail. You may also drag a thumbnail into the page composition window at any desired place in your current open document.

Delete source files from network scanner When you click the green scan button or double-click a thumbnail in network scanning mode, the file will always be removed from the network scanning source folder. However, a copy of the original unprocessed file will always be stored in a “processed” folder created in the source folder unless you check this option. Keeping these copies in the “processed” folder may be useful in case something goes wrong in the after scan processing, or if the specified saving location is temporarily unavailable. If you keep this option unchecked, you will not need to rescan your pages if something goes wrong. Instead, you can just re-process the batch of failed documents using the same network scanning profile, or drag them into a suitable profile in DocServer for background batch processing.

Report processing results in a log file Choose this option to report results of the after scanning process to a log file. This option is useful for finding possible problems when designing complex production environments.

The Separation tab

Separation is an important part of production scanning, as it is normally more efficient to fill the entire document tray with a batch of documents and let PixEdit separate each document in the after scanning process. This is especially true if each document contains few pages.

To indicate the beginning of a new document in a batch, you can use barcode stickers on the first page of each document or barcode separation sheets between documents. The content of the barcode normally indicates the document file name. You may use the same barcode sticker or separation sheet for every document. In that case PixEdit will add an incrementing number to each file. If each separation indicator contains several barcodes or stickers, PixEdit will combine these to create the file name. The barcode content does not need to contain the entire file name, as you can combine the original file name (if you use network scanning) with the file name extracted from the barcode.

PixEdit is compatible with many document management systems, and will automatically detect separation sheets from these systems and act according to their specifications. For more information about specifications, contact the producer of your document management system. You may also use Tools, Create Barcode Separation Sheet in PixEdit to make separation sheets. These can be used as standard document separators, in addition to being used to automatically send the document by e-mail.

If you are creating your own barcode stickers, make sure that you are using a large enough font so that they can be read easily by your scanner. Using a font size with two characters per cm (five characters per inch) should normally give you a good safety margin at 300 DPI when using barcode Type 39. Make sure you have a 3 mm (1/8 inch) white border around the barcode itself.
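The safety margin can be estimated with a little arithmetic. The Python sketch below assumes that “Type 39” refers to the common Code 39 symbology, where each character consists of nine elements (three wide and six narrow) followed by a narrow gap, and it assumes a wide-to-narrow ratio of 3:1. Both function names are hypothetical.

```python
def pixels_per_narrow_bar(chars_per_cm=2.0, dpi=300, wide_to_narrow=3.0):
    """Estimate how many scanned pixels cover the narrowest bar."""
    pixels_per_char = (dpi / 2.54) / chars_per_cm
    # Code 39: 6 narrow + 3 wide elements, plus 1 narrow gap per character.
    modules_per_char = 6 + 3 * wide_to_narrow + 1
    return pixels_per_char / modules_per_char


def quiet_zone_pixels(border_mm=3.0, dpi=300):
    """Convert the white border around the barcode to pixels."""
    return border_mm / 25.4 * dpi
```

Under these assumptions, the recommended size gives roughly 3.7 pixels per narrow bar at 300 DPI and a white border of about 35 pixels. Lowering the resolution or shrinking the font reduces the number of pixels per bar proportionally.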

Barcode stickers may be placed anywhere on a page, but must be attached nearly horizontally or vertically (if specified) on the paper. It does not matter if you place the sticker upside down.

PixEdit will ignore barcodes with a hand-drawn line across the sticker. Use this method if your document contains pages with barcodes that you would like PixEdit to ignore.

Separation method – Separate all pages Choose this option to separate each page in the batch as a single page document. This is useful if you are only scanning single page documents. Naturally, there is no need to use separation sheets when separating all pages.

Separation method – Automatic document separation using separation sheets Choose this option if you are using barcode stickers or separate separation sheets. The barcode type should be Type 39. The document is detached from the batch and given the name specified in the barcode. If more than one barcode is found, the name will be a combination of the detected barcodes. The document is just prepared with the name, but not saved. Details about file format, compression methods, naming conventions and so on should be defined under the Saving tab.
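The separation logic described above can be modelled as follows. This Python sketch is a simplified illustration under stated assumptions, not PixEdit’s implementation: pages are represented as (page id, list of detected barcodes) pairs, several barcodes are combined with an underscore, repeated names receive an incrementing number, and pages found before the first barcode are collected under a default name. The include_sheet parameter mirrors the choice between barcode stickers on the first page and separate separation sheets.

```python
from collections import Counter


def separate_batch(pages, include_sheet=True, joiner="_",
                   default_name="untitled"):
    """Split a scanned batch into named documents.

    pages: list of (page_id, barcodes) tuples in scanning order.
    Returns a list of (document_name, [page_id, ...]) tuples.
    """
    documents = []
    current = None
    for page_id, barcodes in pages:
        if barcodes:
            # A page with barcodes starts a new document.
            current = {"name": joiner.join(barcodes), "pages": []}
            documents.append(current)
            if include_sheet:
                current["pages"].append(page_id)
        else:
            if current is None:
                current = {"name": default_name, "pages": []}
                documents.append(current)
            current["pages"].append(page_id)

    # Add an incrementing number where the same name occurs repeatedly.
    totals = Counter(doc["name"] for doc in documents)
    used = Counter()
    result = []
    for doc in documents:
        base = doc["name"]
        name = base
        if totals[base] > 1:
            used[base] += 1
            name = f"{base}{joiner}{used[base]}"
        result.append((name, doc["pages"]))
    return result
```

As in the description above, the sketch only prepares each document with a name. Saving, file format and naming conventions would be handled separately.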

Change document orientation by rotating You can use this option to turn all pages, or every second page in any direction you prefer. This option is useful when you scan portrait batches in landscape to shorten scanning time. Some scanners deliver every second page turned ±90 degrees when you scan landscape documents. To correct such situations, enable this option and choose the turn method in the drop down list.

By default, PixEdit looks for horizontal barcodes only and ignores vertical barcodes. The reason is that printed letters often already contain vertical barcodes on the left-hand side. However, by using the rotation option, you can let PixEdit look for vertical instead of horizontal barcodes. If you use this method, you must counter-rotate the document back to normal orientation using a PixEdit macro under the General tab.
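The following Python sketch illustrates rotating every second page, and shows that a counter-rotation restores the original orientation. Pages are modelled as small lists of pixel rows. The function names, and the choice of treating the second, fourth, sixth page and so on as the turned ones, are assumptions made for the example.

```python
def rotate_cw(page):
    """Rotate a page (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*page[::-1])]


def rotate_ccw(page):
    """Rotate a page (list of rows) 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*page)][::-1]


def rotate_every_second_page(pages, clockwise=True):
    """Turn the 2nd, 4th, 6th ... page and leave the others untouched."""
    turn = rotate_cw if clockwise else rotate_ccw
    return [turn(p) if i % 2 == 1 else p for i, p in enumerate(pages)]
```

Applying rotate_ccw to a page that was turned with rotate_cw gives back the original page, which is the effect the counter-rotating macro is meant to achieve.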

Number of barcodes on separation sheet It is good practice to specify how many barcodes you are using as separation indicators to increase production reliability. PixEdit will issue an error message if a different number of barcodes than specified is detected. If you check Ignore pages with a non matching barcode count, PixEdit will instead ignore all barcodes on such pages and treat them like ordinary pages. If you are using the special Techsoft barcode separation sheet (Tools, Create Barcode Separation Sheet), no specification of barcodes is necessary.

Search for barcodes on the upper By default, PixEdit will look for barcodes on the entire page. Using this option, you may instruct PixEdit to look for barcodes only on, for example, the upper 20% of each page.

Threshold for grayscale/color conversion Normally you don’t need to change the default value of 128. If you experience errors during barcode detection, you may need to adjust this value. Beware that an incorrect value may result in lost barcodes. This setting does not affect or alter the page, as it is only used internally by PixEdit during barcode detection. This setting should only be altered by service personnel or system administrators.
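To show why the threshold does not alter the page, the Python sketch below builds a separate black and white working copy that a barcode detector could search, optionally limited to the upper part of the page. It is a generic illustration, not PixEdit’s internal code. The function name, the rule that grey values below the threshold become black, and the representation of a page as rows of 8-bit grey values are assumptions.

```python
def barcode_search_image(page, threshold=128, upper_percent=100):
    """Return a black/white working copy of the upper part of a page.

    0 means black and 1 means white. The original page is left
    untouched, since the copy is used for detection only.
    """
    rows_to_search = (len(page) * upper_percent) // 100
    region = page[:rows_to_search]
    return [[0 if value < threshold else 1 for value in row]
            for row in region]
```

A threshold that is too low turns faint bars white, and one that is too high turns a grey background black. Either way, bars may merge or disappear in the working copy, which is how an incorrect value can result in lost barcodes.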

Ignore pages with any barcodes less than In rare cases, a page may contain graphics looking like barcodes. If this option is not checked, PixEdit will issue an error message when such graphics are detected. Normally, a barcode lookalike will be decoded to contain between two and four characters. If your genuine barcodes contain considerably more characters than this, it is good practice to set this value to a low number to prevent unnecessary error messages.

Ignore barcodes that does not contain the following characters Your documents may already contain barcodes that you would like to ignore during production. If your own barcodes always contain, for example, the text “ab”, you can specify that any barcode that does not contain “ab” should be ignored. Optionally you can let PixEdit remove the specified text from the final file name by checking Remove the specified characters from the generated filename.

The barcodes must contain If your barcodes always contain a specific number of characters, you may specify this number here. PixEdit will issue an error message if detected barcodes do not contain the specified number of characters.

Barcodes are numeric only Check this option to let PixEdit warn you if a non-numeric barcode is detected.

Verify start and stop characters in barcodes Some barcode label writers add start and stop characters (“*”) to barcodes. Choose this option to let PixEdit issue an error message if start or stop characters are missing from a detected barcode.
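The barcode filtering and verification options above can be thought of as a small set of rules applied to each decoded barcode. The Python sketch below is one way to model them. The function name, the parameter names, the order in which the rules are applied, and the removal of the start and stop characters before the remaining checks are all assumptions made for illustration; PixEdit may apply the rules differently.

```python
def check_barcode(value, ignore_shorter_than=0, must_contain="",
                  remove_marker=False, exact_length=None,
                  numeric_only=False, verify_start_stop=False):
    """Classify a decoded barcode as ignored, rejected or accepted.

    Returns (status, file_name_part, problems).
    """
    # Rules that silently ignore lookalikes and foreign barcodes.
    if len(value) < ignore_shorter_than:
        return ("ignored", None, [])
    if must_contain and must_contain not in value:
        return ("ignored", None, [])

    # Rules that report a problem.
    problems = []
    content = value
    if verify_start_stop:
        if len(value) >= 2 and value[0] == "*" and value[-1] == "*":
            content = value[1:-1]
        else:
            problems.append("missing start or stop character")
    if exact_length is not None and len(content) != exact_length:
        problems.append("unexpected length")
    if numeric_only and not (content.isascii() and content.isdigit()):
        problems.append("non-numeric barcode")
    if problems:
        return ("rejected", None, problems)

    if remove_marker and must_contain:
        content = content.replace(must_contain, "")
    return ("accepted", content, [])
```

For example, with must_contain set to “ab” and remove_marker enabled, the barcode “ab12345” is accepted and contributes “12345” to the file name, while a barcode without “ab” is ignored without any error message.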

Always add an incrementing number to the end of the file name Use this option to add an incrementing file number to separated documents.

Include barcode separation sheet in separated document This option should be checked if you are using barcode stickers attached to the first page in each document. If you are using separate separator sheets, however, uncheck this option.

Remove any blank page after the separation sheet If you are using separate separation sheets, check this option to remove the blank flipside. You don’t need to use this option if you already use the Remove blank pages option in the General tab.

The Forms tab (PixEdit with OCR only)

The Forms tab lets you extract information from scanned forms directly after scanning, with or without saving the raster forms. Regardless of whether your task is to extract form data automatically after scanning a batch or by using the Batch Wizard, you will first need to define a form original. To learn how to create a form original, read Defining and creating forms.

If you use the content from your forms to create file names for raster files (if enabled), any naming convention you specify in the Saving tab will be ignored.

If you often process new types of forms, it may be easier to do forms processing using the Batch Wizard instead of processing them using the Forms tab in After Scanning. The reason is that an untested, newly defined form may contain errors, and you will then have to rescan the batch instead of just re-processing a saved batch using the Batch Wizard. It is a good idea to first learn forms processing using the Batch Wizard before using this functionality in After Scanning.

Forms processing Check this option to enable forms processing in After Scanning. You cannot use general separation using separator sheets or barcode stickers at the same time as you use forms processing. The reason is that the defined form original defines how many pages there are in each form.

If your only task is to separate the forms from the batch into separate document files without extracting data, you can just define an empty form consisting of the number of pages you would like to have in each separated document. The most common use, however, is to extract data such as text, check marks, numbers and so on.

Current selected form for processing This is where you specify which form original to use during data extraction.

Save extracted form data You may want to uncheck this option when testing a new form original to avoid producing unnecessary data files.

Forms data layout Use this option to specify how you want to store your extracted data. Choose between saving each form in separate files, each batch in separate files, or the entire project in a separate file.

Forms data format Extracted data can be saved in formatted text or XML. Each field will be separated with Tab and each form with a newline character if you choose formatted text.
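The formatted text layout described above, with one line per form and a Tab between fields, can be sketched in Python as follows. The function name is hypothetical, and the replacement of Tab and newline characters inside a field with a space is an assumption made so that the example output stays unambiguous. How PixEdit treats such characters is not specified here.

```python
def forms_to_text(forms):
    """Format extracted forms as text: Tab between fields, newline per form."""
    lines = []
    for fields in forms:
        cleaned = [str(f).replace("\t", " ").replace("\n", " ")
                   for f in fields]
        lines.append("\t".join(cleaned))
    return "".join(line + "\n" for line in lines)
```

A file in this layout can be opened directly in most spreadsheet programs, or read back by splitting each line on the Tab character.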

Save extracted form data in the following directory Type the path or use the Browse button to specify where to save extracted data.

Use content from the forms title item as filename(s) Title items are used for creating file names from extracted fields in forms. If you have defined title items in your original form, this option can be used to create file names based on the content from these fields. Title items can be extracted from numbers, text and barcodes. This option overrides any naming rules specified in the Saving tab.

Convert true color pages to monochrome during processing If you choose to save not only the extracted forms data, but also the original raster forms, you may want to convert any raster forms scanned in full color to pure black/white forms to save storage space. Click the Options button to choose a suitable conversion method.

Error handling Choose between correcting errors during or after processing. If you choose to correct errors during processing, the extraction process will halt while you correct errors. For larger jobs you may prefer to let the process run unattended and correct errors after processing.

The Saving tab

The Saving tab specifies where the current profile should save produced documents, as well as the file format and compression type. Any specification in the Saving tab may be overridden by content found in barcodes if specified in the Separation tab. You may also close the generated file after saving and create unique folders with the same name as the document.

PixEdit is capable of saving your documents in many different file formats and compression types. The most popular choices are PDF and TIFF. PDF is the only format that can store OCR information, so if you need to make your documents searchable, choose PDF as the saving format.

If the Separation tab specifies that file names should be extracted from barcode stickers or separate separation sheets, or if the Forms tab (OCR version only) specifies that file names should be extracted from defined item fields, any file name specification in the Saving tab will be overridden. The Saving tab offers several ways to automatically name your files, such as date/time, name incrementation and so on.

Save the scanned document as follows: You don’t need to save your scanned document automatically after scanning. If you choose not to save the document, each document will be prepared with a file name according to your specification but not physically saved on disk. The document will remain open in PixEdit under a tab in the user interface. If you choose not to save your documents automatically, PixEdit will after a short while have many documents open at the same time. This will affect the performance of the after scan processing, since having many open documents requires substantial memory resources from the computer. You will need to have several gigabytes of RAM installed in your PC if you need to have many documents open at the same time.

Save in the following directory This choice defines the main directory to be used for automatic saving of your documents. By using the additional option Create sub folder based on the file name, an additional sub-folder will be created.

Save as type Specifies file type to be used. Choose between PDF Raster, PDF/A-1b Raster (ISO standard), PDF/A-1b Compact (ISO Standard), TIFF and TDF. Unless you have compatibility issues with your document management system, it’s recommended to use either PDF/A-1b Raster or PDF/A-1b Compact since they conform to the ISO standard.

PDF/A files can be stored using several different compression methods. The most commonly used compression types are JPG for color pages and CCITT Gr.IV or JBIG for black and white pages, due to their high compression rates. Note, however, that the ISO standard states that “JPEG is not recommended” in PDF/A files. The main reason for this statement is probably that JPG is a destructive compression method: each time a page with JPG compression is saved, the quality is slightly degraded. However, the ISO standard does not forbid using JPG as a compression method. Before using JPG as the compression method in your PDF files, you should therefore check company policies. If you want to use a non-destructive compression method, choose Deflate.
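What “non-destructive” means in practice can be demonstrated with Python’s standard zlib module, which implements the Deflate algorithm. The sketch below compresses a block of raw pixel bytes and restores it again. It only illustrates the property of the compression method and says nothing about how PixEdit writes its PDF files; the function name is hypothetical.

```python
import zlib


def deflate_round_trip(raw_pixels):
    """Compress raw pixel bytes with Deflate and restore them again."""
    compressed = zlib.compress(raw_pixels, 9)
    restored = zlib.decompress(compressed)
    return compressed, restored
```

The restored bytes are always identical to the input, no matter how many times the data is compressed and decompressed. The price is a lower compression rate than JPG on typical color scans.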

To keep the file size small while still conforming to ISO PDF/A, choose the PDF/A-1b Compact file format. Important: Make sure you have set up ScanBar to 300 DPI, full color, and that you have chosen ACRO under the General tab when using this very compact file format. The files will typically be 5-8 times smaller than files with the JPG compression method. PDF/A-1b Compact is suitable for ordinary post and general documents, but less suitable for technical drawings. Compact compression is not suitable for large format documents.

Filename generation PixEdit offers several methods for automatic filename generation. If you have specified generation of filenames from barcodes or from item fields, any filename generation in this tab will be ignored.

Choose between date/time, incrementing filename, incrementing file extension and original filename. If you have a long production line with several PixEdit licenses in series using network scanning mode, choose original filename to keep the same name throughout the entire production line.
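Two of these naming methods, date/time and incrementing filename, can be sketched in Python as shown below. The exact patterns PixEdit uses are not described here, so the timestamp layout, the four-digit counter and both function names are assumptions chosen for the example.

```python
from datetime import datetime


def name_from_datetime(moment, extension="pdf"):
    """Build a file name from a date and time (assumed pattern)."""
    return moment.strftime("%Y%m%d_%H%M%S") + "." + extension


def next_incremented_name(base, existing, extension="pdf", digits=4):
    """Return the first free name of the form base0001.pdf, base0002.pdf ..."""
    number = 1
    while f"{base}{number:0{digits}d}.{extension}" in existing:
        number += 1
    return f"{base}{number:0{digits}d}.{extension}"
```

Date/time names sort chronologically and need no knowledge of earlier files, whereas an incrementing name requires checking which names are already taken in the target folder.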

Print status sheet on default printer In larger production lines, the number of scanned batches is high. It is good practice to let PixEdit automatically print out a status sheet as soon as a batch has been scanned, and then put this page on top of the scanned batch. The status sheet contains relevant information about the scanned batch, such as the number of pages, the number of separated documents, date and time scanned and so on. Some users prefer to use colored sheets in the printer so that they can be found and seen easily.

Append document title to original filename Check this option if you would like the final filename to be a combination of the original filename and the one generated. Some production lines are using this option to track the machine and scanner name(s) used for a particular document.

Close document after saving Unless you need to keep scanned documents open after completed processing, check this option. Keeping many documents open at the same time may reduce processing performance, unless you have several gigabytes of RAM installed in your computer.

The Profile tab

Setting up an after scan process often involves configuring all tabs in the After Scanning dialog box. If you need to switch quickly between two or more production configurations, you should consider storing your different configurations in after scanning profiles, rather than reconfiguring the current profile. You activate a profile by selecting it in ScanBar.

Profiles also enable you to run several types of production at the same time by using DocServer.

Notice that when you select a different scanning profile (in the upper part of ScanBar), the corresponding after scanning profile will be activated. In other words, the after scanning profile “follows” the scanning profile.

All changes you make in the current after scanning profile will be stored in this profile. To make a new profile based on the current profile, click New and give the new profile a suitable name. All subsequent changes will be saved in the new profile. To delete the current profile, click Delete.
