Data From PDF


The Import from PDF function extracts the required data from PDF files.

SheetKraft provides the Data From PDF option, which is accessed through the Import From button on the SheetKraft toolbar (see figure).

[Figure: SheetKraft ribbon]

The following guidelines describe how to import data from a particular PDF using Data From PDF. Data From PDF has two modes of operation: Fixed Area and Table Area.

Fixed Area: Extracts data that is static or is not in a tabular format.

Table Area: Extracts data that is in a tabular format and can grow vertically.

Fixed Area

Step 1

Click the Import From button and select the PDF option to open the dialogue box shown below.

[Figure: dialogue box opened by the PDF option]

Step 2

Click the New Config button next to Config Folder to set up the PDF configuration. The dialogue box below appears.

[Figure: New Config dialogue box]

Select the path where you want the config folder to be saved.

[Figure: config folder path selection]

Step 3

After selecting the config folder path, select the PDF file from which data needs to be extracted. A window like the one below appears.

[Figure: configuration window with the sample PDF loaded]

The window shows the config folder path and the sample file name. If the PDF is password-protected, enter the password in the Password box. The '+' button next to Add Test Files lets you add multiple files that have the same format as the sample file. The DPI (dots per inch) setting adjusts the resolution of the PDF.
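As a rough illustration of why the DPI setting matters when areas are selected on the page: PDF page dimensions are defined in points (1/72 inch by default), so the size of the rendered page in pixels scales linearly with DPI. The sketch below is not SheetKraft code. The function name is hypothetical, and whether SheetKraft stores coordinates in pixels or in points is an assumption that this page does not confirm.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


def points_to_pixels(points: float, dpi: int) -> float:
    """Convert a length in PDF points to pixels at the given rendering DPI."""
    return points * dpi / POINTS_PER_INCH


# An A4 page is approximately 595 x 842 points.
for dpi in (72, 150, 300):
    width_px = points_to_pixels(595, dpi)
    height_px = points_to_pixels(842, dpi)
    print(f"{dpi} DPI: {width_px:.0f} x {height_px:.0f} px")
```

If coordinates are indeed pixel-based, a configuration set up at one DPI would not line up with pages rendered at another, so it seems prudent to keep the DPI unchanged once areas have been selected.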

Step 4

Once the PDF is loaded, click the Fixed Area operation. The Fixed Area tab expands (see the picture below).

[Figure: expanded Fixed Area tab]

Fixed Area has a few parameters that must be set for the data to be extracted.

Fixed Area parameters:

| Parameter | Description |
| --- | --- |
| Search Area | Select the area from which data needs to be extracted (i.e. select the X co-ordinate, Y co-ordinate, Height, and Width). Also enter the page number on which the setup is done in the Page text box. |
| Masks | Mask the value that isn't constant, or the data that needs to be extracted. Select the X co-ordinate, Y co-ordinate, Height, and Width. There can be multiple masks. |
| Title Header Size | Select the title header from the first page of the PDF. |
| Title Footer Size | Select the title footer from the first page of the PDF. |
| Header Size | Select the header size from the PDF. |
| Footer Size | Select the footer size from the PDF. |
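To make the relationship between the Search Area and the masks more concrete, here is a minimal Python sketch. It is not SheetKraft's implementation or config format: the `Rect`, `FixedArea`, and `split_words` names are hypothetical, and the sketch assumes one plausible reading of the parameters, namely that text inside a mask is the variable part and the remaining text in the search area is the constant part.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """A rectangle given by X, Y, Width and Height (hypothetical model)."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass
class FixedArea:
    page: int
    search_area: Rect
    masks: List[Rect] = field(default_factory=list)


def split_words(area: FixedArea, words: List[Tuple[str, float, float]]):
    """Split words inside the search area into constant and masked text."""
    constant, variable = [], []
    for text, x, y in words:
        if not area.search_area.contains(x, y):
            continue  # outside the search area: ignored
        if any(mask.contains(x, y) for mask in area.masks):
            variable.append(text)  # assumed: masked text is the variable value
        else:
            constant.append(text)
    return constant, variable


# Illustrative data: (text, x, y) positions of words on page 1.
area = FixedArea(page=1,
                 search_area=Rect(50, 100, 300, 20),
                 masks=[Rect(150, 100, 200, 20)])
words = [("Invoice", 60, 110), ("No:", 110, 110),
         ("INV-0042", 160, 110), ("Page", 500, 800)]
constant, variable = split_words(area, words)
print(constant, variable)
```

In this reading, the constant text ("Invoice No:") is what identifies the area across similar files, while the masked value changes from file to file.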

Once the required selections are made, the window looks like the picture below.

[Figure: Fixed Area selections on the sample PDF]

Step 5

Once the selection is done, click the Save Config button, then click OK. You will see the screen below.

[Figure: screen shown after saving the config]

Select the file from which data needs to be extracted. This can be a single file or multiple files. Click Next, then select where you want to paste the Data From PDF formula.

Table Area

To select the Table Area option, follow Fixed Area steps 1 to 3.

Once the PDF is loaded, click the Table Area operation. The Table Area tab expands (see the picture below).

[Figure: expanded Table Area tab]

The end of a table is determined by one of the following criteria:

| Criteria | Description |
| --- | --- |
| End Area | The table has a start area and an end area, both of which must be specified. During extraction, the start area marks the beginning of the table, and the start of the end area marks the end of the table. |
| Next Start | The table may not have a specific end area. In this case, when the next start area is found, the current table is terminated and the next one begins. |
| End of Page | The table ends on the same page. |
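The difference between the three criteria can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how SheetKraft works internally: pages are modelled as lists of text lines, the start and end areas are reduced to exact marker strings, and the names `EndCriterion` and `extract_tables` are hypothetical.

```python
from enum import Enum


class EndCriterion(Enum):
    END_AREA = "end_area"
    NEXT_START = "next_start"
    END_OF_PAGE = "end_of_page"


def extract_tables(pages, start, criterion, end=None):
    """Return one list of lines per table found after a start marker."""
    tables, current = [], None
    for page in pages:
        for line in page:
            if line == start:
                # In this sketch a new start marker always closes an open
                # table; this is the defining behaviour of Next Start.
                if current is not None:
                    tables.append(current)
                current = []
            elif current is not None:
                if criterion is EndCriterion.END_AREA and line == end:
                    tables.append(current)
                    current = None
                else:
                    current.append(line)
        # End of Page: an open table never continues onto the next page.
        if criterion is EndCriterion.END_OF_PAGE and current is not None:
            tables.append(current)
            current = None
    if current is not None:
        tables.append(current)
    return tables


pages = [["START", "a", "END", "x"], ["y", "START", "b"]]
by_end_area = extract_tables(pages, "START", EndCriterion.END_AREA, end="END")
by_next_start = extract_tables(pages, "START", EndCriterion.NEXT_START)
by_end_of_page = extract_tables(pages, "START", EndCriterion.END_OF_PAGE)
```

With the sample data, End Area stops the first table at the end marker, Next Start lets it run across the page break until the second start marker, and End of Page cuts it off at the bottom of the first page.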

The following parameters must be filled in to set up Table Area.

Table Area parameters:

| Parameter | Description |
| --- | --- |
| Start Area | Specifies where the table starts. Specify the X co-ordinate, Y co-ordinate, Height, and Width. Can be masked. |
| End Area | Specifies where the table ends. Specify the X co-ordinate, Y co-ordinate, Height, and Width. Can be masked. |
| Table Area | The table from which the actual data needs to be extracted. Specify the X co-ordinate, Y co-ordinate, Height, and Width. Cannot be masked. |
| Masks | Mask the value that isn't constant, or the data that needs to be extracted. Select the X co-ordinate, Y co-ordinate, Height, and Width. There can be multiple masks. |
| Column Separator | Column separators split the Table Area into columns. One must be specified for every partition. |
| Row Separator | Row separators split the Table Area into rows. |
| Columns | Gives a header to each column created by the Column Separator parameter. |
| Title Header Size | Select the title header from the first page of the PDF. |
| Title Footer Size | Select the title footer from the first page of the PDF. |
| Header Size | Select the header size from the PDF. |
| Footer Size | Select the footer size from the PDF. |
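The way Column Separator, Row Separator, and Columns work together can be pictured with a short Python sketch. It is a simplified model rather than SheetKraft's actual logic: separators are assumed to be plain X and Y co-ordinates, words are assumed to arrive in reading order with a position each, and the function name `to_grid` is hypothetical.

```python
from bisect import bisect_right


def to_grid(words, column_separators, row_separators, columns):
    """Assign each (text, x, y) word to a row and a named column."""
    col_seps = sorted(column_separators)
    row_seps = sorted(row_separators)
    if len(columns) != len(col_seps) + 1:
        raise ValueError("need exactly one more column name than separators")
    # One dict per row, keyed by column header.
    grid = [{name: "" for name in columns} for _ in range(len(row_seps) + 1)]
    for text, x, y in words:
        row = bisect_right(row_seps, y)           # rows above/below each separator
        col = columns[bisect_right(col_seps, x)]  # columns left/right of each separator
        grid[row][col] = f"{grid[row][col]} {text}".strip()
    return grid


# Illustrative data: two column separators give three columns.
words = [("Widget", 10, 5), ("2", 120, 5), ("9.50", 210, 5),
         ("Gadget", 10, 25), ("Pro", 50, 25), ("1", 120, 25), ("20.00", 210, 25)]
grid = to_grid(words, column_separators=[100, 200], row_separators=[15],
               columns=["Item", "Qty", "Price"])
for row in grid:
    print(row)
```

The sketch also shows why a separator is needed for every partition: with N separators there are N + 1 columns, and each of them needs a header in Columns.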

Once the above is done, click Save Config, then follow Fixed Area Step 5.