Multiple answer sheets will be provided for processing. The following general requirements apply to all answer sheets:
The program must accept multi-page TIFF images. Accepting multi-page PDF files as well would be good but is not mandatory.
A multi-page document may include answer sheets of different types in the same file.
The first step of the scanning process must be identification of the answer sheet template. If an image cannot be identified, it must be sent to a separate folder so that a user can manually assign the sheet template from another workstation.
The system must scale images based on registration marks (rectangles in the corners).
The system must accept images that are skewed or misaligned.
The system must have a configuration file that sets the minimum fill % at which a bubble counts as filled. If no bubble in the zone meets the minimum fill %, the output is blank.
The configuration file must also set the condition for comparing two or more filled bubbles and selecting the correct one. For example, the user may define that the selected bubble must be 10% more filled than the other bubbles; otherwise the zone is treated as having multiple bubbles. Multiple bubbles have an output of *.
Each answer sheet will have text and bar-code information that must be processed. In all cases, the data is present as both bar code and text. The system should read both and, if they match, use the data. If the bar code and text do not match, it should select the one it is most confident in. The system must provide a method for sending sheets that have missing or questionable data in these zones to a separate queue, so that a user on another machine can process them without stopping the primary machine. The secondary process must allow easy viewing of questionable areas and manual data entry. For example, if the student ID is not read, the screen should show that area of the answer sheet and provide a field for the user to type in the student ID.
The final file output must be usable by eDoctrina. At completion of scanning, the image must be converted to PDF with a filename based on information in the image (District ID, Student ID and Test ID).
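The minimum-fill and comparison requirements above can be sketched as a small decision function. This is only an illustration under stated assumptions: the name `select_bubble` and its parameters are hypothetical, fill levels are percentages from 0 to 100, the "10% more filled" rule is read as a difference in percentage points, only bubbles that meet the minimum fill are compared, and the selected bubble is reported by its 0-based index.

```python
from typing import Sequence


def select_bubble(fills: Sequence[float], min_fill: float, min_difference: float) -> str:
    """Decide the output for one zone of bubbles.

    fills          -- measured fill % of each bubble in the zone
    min_fill       -- minimum fill % for a bubble to count as filled
    min_difference -- required lead (in percentage points, an assumption)
                      of the darkest bubble over the next darkest one

    Returns "" for a blank zone, "*" for multiple bubbles, otherwise the
    0-based index of the selected bubble as a string (an assumption).
    """
    # Keep only bubbles that meet the minimum fill, darkest first.
    filled = sorted(
        ((fill, index) for index, fill in enumerate(fills) if fill >= min_fill),
        reverse=True,
    )
    if not filled:
        return ""  # no bubble meets the minimum fill: blank output

    best_fill, best_index = filled[0]
    if len(filled) > 1 and best_fill - filled[1][0] < min_difference:
        return "*"  # two filled bubbles are too close: multiple bubbles

    return str(best_index)
```

With a minimum fill of 40 and a required difference of 10, fills of `[5, 80, 60, 3]` select bubble 1, while `[5, 80, 75, 3]` yield `*`.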
Scanned from a Xerox multifunction device001.tif
Dewey SLO Qtr 3 #2 Redo 001.tif
Scanned from a Xerox multifunction device002.tif
Scanned from a Xerox multifunction device003.tif
Scanned from a Xerox multifunction device001.tiff
Scanned from a Xerox multifunction device004.tif
{ name: 'FASTeST Small'
, darknessPercent: ''
, darknessDifferenceLevel: ''
, regions: { { name: "district_id", outputPosition: "0", type: "barCode&Text"
, areas: { { type: "barCode", left: "10", top: "10", width: "100", height: "10" }
, { type: "numbersText", left: "10", top: "5", width: "100", height: "5" }
}
, active: true|false
}
, { name: "test_id", outputPosition: "1", type: "barCode&Text"
, areas: { { type: "barCode", left: "10", top: "30", width: "100", height: "10" }
, { "type":"numbersText","left":1307,"top":201,"width":26,"height":41 } }
, active: true|false
}
, { name: "sheetIdentifier"
, type: "barCode&Text"
, areas: { { type: "barCode", left: "10", top: "30", width: "100", height: "10" }
, { type: "numbersText", left: "10", top: "5", width: "100", height: "5" }
}
}
, { name: "index_of_first_question"
, type: "barCode&Text"
, areas: { { type: "barCode", left: "10", top: "30", width: "100", height: "10" }
, { type: "numbersText", left: "10", top: "5", width: "100", height: "5" } }
, active: true|false
}
, { name: "amout_of_questions"
, type: "barCode&Text"
, areas: { { type: "barCode", left: "10", top: "30", width: "100", height: "10" }
, { type: "numbersText", left: "10", top: "5", width: "100", height: "5" } }
, active: true|false
}
, { "name":"leftTopBlackBox"
, "type":"marker"
, areas: { left: "10", top: "30", width: "100", height: "10" }
, active: true|false
}
, { "name":"rightTopBlackBox"
, "type":"marker"
, areas: { left: "10", top: "30", width: "100", height: "10" }
, active: true|false
}
, { "name":"rightBottomBlackBox"
, "type":"marker"
, areas: { left: "10", top: "30", width: "100", height: "10" }
, active: true|false
}
, { "name":"leftBottomBlackBox"
, "type":"marker"
, areas: { left: "10", top: "30", width: "100", height: "10" }
, active: true|false
}
, { name: "answers", outputPosition: "4", indexOutputPosition: "3", type: "bubblesRegion"
, areas: [{ left: "10", top: "40", width: "100", height: "100"
, lineHeight: "10", bubblesPerLine: "10"
, bubble: {"x":248,"y":752,"width":62,"height":38}
}
,{ left: "10", top: "40", width: "100", height: "100"
, lineHeight: "10", bubblesPerLine: "10"
, bubble: {"x":248,"y":752,"width":62,"height":38}
}]
, active: true|false
}
}
, additionalOutputData: { { outputPosition: 10, value: "%fileName%" }
, { outputPosition: 11, value: "la-la-la" } }
, outputFileNameFormat: "{district_id}_{test_id}_{student_uid}_{index_of_first_question}_{amout_of_questions}"
}
List of sheet attributes
Name | Required | Description |
---|---|---|
name | no | Sheet name |
regions | yes | List of regions |
additionalOutputData | no | Additional data which must go into the output file |
outputFileNameFormat | yes | Format for the output file name. Each {region_name} placeholder is replaced with the value read from the corresponding region. For example: {district_id}_{test_id}_{student_uid}_{firstQuestionIndex}_{numberOfQuestions} |
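The placeholder substitution described for outputFileNameFormat could look roughly like the sketch below. The function name `build_output_name` and the choice to raise an error for a region with no value are assumptions made for illustration; the specification does not say how a missing value should be handled.

```python
import re
from typing import Mapping


def build_output_name(name_format: str, region_values: Mapping[str, str]) -> str:
    """Replace each {region_name} placeholder with the value read from that region."""

    def substitute(match: "re.Match[str]") -> str:
        region_name = match.group(1)
        if region_name not in region_values:
            # Assumption: a missing region value is treated as an error here.
            raise KeyError(f"no value for region {region_name!r}")
        return region_values[region_name]

    return re.sub(r"\{(\w+)\}", substitute, name_format)


values = {"district_id": "68", "test_id": "5789", "student_uid": "00142674"}
print(build_output_name("{district_id}_{test_id}_{student_uid}", values))
```

A file extension is deliberately not appended here, since the format string in the sample config does not include one.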
List of sheet regions
Name | Required | Description |
---|---|---|
name | no | Region name |
outputPosition | no | Position in output file |
type | yes | Type of region. See below. |
areas | yes | List of region areas |
value | no | Value found in region. Only for sheetIdentifier region. |
active | no | Active by default but could be deactivated by adding active: false |
List of region types
Name | Description |
---|---|
singleValue | Single value, such as district ID or teacher name |
bubblesRegion | Bubbles region |
firstQuestionIndex | Index of first question on the page |
numberOfQuestions | Number of questions on the page |
sheetIdentifier | Answer sheet identifier - we need to search for this on the page and compare |
leftTopBlackBox, rightTopBlackBox, leftBottomBlackBox, rightBottomBlackBox | Black boxes positions |
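The four black-box regions presumably serve as the registration marks used for scaling and for handling skewed images. As a rough sketch, assuming the centres of the top-left and top-right boxes have already been located in the scan, the scale and skew angle relative to the template can be estimated from that one edge. The function and parameter names are hypothetical, and a fuller implementation would likely use all four marks.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def estimate_scale_and_angle(
    template_left: Point,
    template_right: Point,
    found_left: Point,
    found_right: Point,
) -> Tuple[float, float]:
    """Estimate scale factor and skew angle (degrees) of a scanned page.

    template_* are the marker centres defined in the sheet template,
    found_* are the centres detected in the scanned image.
    """
    tdx = template_right[0] - template_left[0]
    tdy = template_right[1] - template_left[1]
    fdx = found_right[0] - found_left[0]
    fdy = found_right[1] - found_left[1]

    # Ratio of the edge lengths gives the scale of the scan.
    scale = math.hypot(fdx, fdy) / math.hypot(tdx, tdy)
    # Difference of the edge directions gives the skew angle.
    angle = math.degrees(math.atan2(fdy, fdx) - math.atan2(tdy, tdx))
    return scale, angle


scale, angle = estimate_scale_and_angle((100, 100), (1100, 100), (50, 50), (550, 50))
print(scale, angle)
```

The angle is not normalised to a fixed range, which is adequate for the small skews typical of scanned pages.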
List of region areas
Name | Required | Description |
---|---|---|
left, top | yes | Coordinates of left top position |
width, height | yes | Dimensions of the area |
type | no | Type of area data. See below. |
lineHeight | no | Height of one line of bubbles. Only for bubblesRegion region. |
bubblesPerLine | no | Number of bubbles per line. Only for bubblesRegion region. |
List of area types
Name | Description |
---|---|
barcode | Area contains a barcode |
text | Area contains text |
capitalText | Area contains text in capital letters |
numbersText | Area contains text with numbers only |
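A region may pair a barcode area with a text area, and the requirements ask the system to use the value when both reads match, to prefer the more confident read when they differ, and to queue missing or questionable data for manual review. The sketch below illustrates one way to express that rule. The `Read` type, the 0.0 to 1.0 confidence scale, and the `min_confidence` threshold are all assumptions; the specification does not define how confidence is measured.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Read:
    value: Optional[str]  # None when nothing could be read
    confidence: float     # hypothetical scale from 0.0 to 1.0


def reconcile(barcode: Read, text: Read, min_confidence: float = 0.8) -> Tuple[Optional[str], bool]:
    """Return (value, needs_review) for one barcode/text pair."""
    # Both reads agree: use the data as is.
    if barcode.value is not None and barcode.value == text.value:
        return barcode.value, False

    candidates = [read for read in (barcode, text) if read.value is not None]
    if not candidates:
        return None, True  # nothing was read: send to the review queue

    # Reads differ or one is missing: take the more confident one and
    # flag it for review when its confidence is below the threshold.
    best = max(candidates, key=lambda read: read.confidence)
    return best.value, best.confidence < min_confidence


print(reconcile(Read("00142674", 0.99), Read("00142674", 0.60)))
print(reconcile(Read(None, 0.0), Read("00142674", 0.55)))
```

Sheets for which `needs_review` is true would go to the separate queue handled on the secondary machine.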
Set of additional hardcoded values in output.
All output position indexes start from 1 (not 0).
Basically, all you need is to use the outputPosition attribute from the config file. In our example, the output file will look like this:
district_id,test_id,,0,answer[0],,,,,,,fileName,la-la-la
district_id,test_id,,1,answer[1],,,,,,,fileName,la-la-la
...
district_id,test_id,,10,answer[10],,,,,,,fileName,la-la-la
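Placing values by output position can be sketched as follows. This is a minimal illustration, not the product's code: `build_row` and its `base` parameter are hypothetical names, and the base is left configurable because the text says positions start from 1 while the sample config uses "0" for district_id. Values are assumed to contain no commas.

```python
from typing import Mapping


def build_row(fields: Mapping[int, object], base: int = 1) -> str:
    """Build one comma-separated output line.

    fields maps an output position to the value stored there;
    positions that are not assigned are left empty.
    """
    if not fields:
        return ""
    if min(fields) < base:
        raise ValueError(f"output positions must be >= {base}")

    row = [""] * (max(fields) - base + 1)
    for position, value in fields.items():
        row[position - base] = str(value)
    return ",".join(row)


# District at position 1, test at 2, question index at 4, answer at 5.
print(build_row({1: "68", 2: "5789", 4: 0, 5: "1"}))
```

If values could ever contain commas or quotes, the standard `csv` module would be a safer way to write the line.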
The answers region is written using outputPosition and indexOutputPosition. When an answer holds several values, they are separated by |, for example 1|5:
68,5789,00142674,0,1,,,,,,,input01.tif,la-la-la
68,5789,00142674,6,1|2|3,,,,,,,input01.tif,la-la-la
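On answer sheets with a second line of values, the values from the first line are joined by ~ with the values from the second line, for example 1|2~5|6. The ~ must always be present for such answer sheets:
68,5789,00142674,0,1|2|3~0|7|8|9,,,,,,,input01.tif,la-la-la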
68,5789,00142674,6,0~0,,,,,,,input01.tif,la-la-la
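The answer field in these lines can be assembled with two joins, one per separator. The helper below is a hypothetical sketch: `format_answer` takes one list of values per line and makes no assumption about what an individual value (such as 0) means.

```python
from typing import Iterable, Sequence


def format_answer(lines: Sequence[Iterable[object]]) -> str:
    """Join values within a line with | and join lines with ~."""
    return "~".join("|".join(str(value) for value in line) for line in lines)


print(format_answer([[1, 5]]))
print(format_answer([[1, 2, 3], [0, 7, 8, 9]]))
```

Passing two lines always produces a ~ in the result, even when each line holds a single value.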
Image notes:
Number | Description |
---|---|
1 | student_uid |
2 | test_id |
3 | test_id |
4 | student_uid |
5 | Sheet type - SMALL |
6 | Sheet type - SMALL |
7 | district_id |
8 | district_id |
9 | Amount of questions on this page. Not required for OCR. |
10 | Amount of questions on this page. Not required for OCR. |
11 | Index of first question on this page |
12 | Index of first question on this page |
13 | Bubbles area |
14 | Bubbles area |