Conference Information

=============================================
  Document Recognition and Retrieval XII (EI-117)
  Call for Papers and Announcement
=============================================

Please email questions to 


This conference is part of the IS&T/SPIE International Symposium on
Electronic Imaging 2005, 17-20 January 2005
San Jose Marriott and San Jose Convention Center, San Jose, CA, USA

Conference Chairs: Elisa H. Barney Smith, Boise State Univ.; Kazem
Taghva, Univ. of Nevada/Las Vegas

Program Committee: James Allan, Univ. of Massachusetts/Amherst; Tim
Andersen, Boise State Univ.; Apostolos Antonacopoulos, Univ. of
Liverpool (United Kingdom); Francine R. Chen, Palo Alto Research Ctr.;
Xiaoqing Ding, Tsinghua Univ. (China); David S. Doermann, Univ. of
Maryland/College Park; Hiromichi Fujisawa, Hitachi, Ltd. (Japan);
Jianying Hu, IBM Thomas J. Watson Research Ctr.; Matthew F. Hurst,
Intelliseek, Inc.; Tapas Kanungo, IBM Almaden Research Ctr.; Xiaofan
Lin, Hewlett-Packard Labs.; Daniel P. Lopresti, Lehigh Univ.; Thomas A.
Nartker, Univ. of Nevada/Las Vegas; Sargur N. Srihari, Univ. at Buffalo;
George R. Thoma, National Library of Medicine; Marcel Worring, Univ. van
Amsterdam (Netherlands); Berrin A. Yanikoglu, Sabanci Univ. (Turkey)

The fields of document recognition and retrieval have grown rapidly in
recent years. This development has been fueled by rising accuracy rates
for omnifont and handprint optical character recognition (OCR),
decreasing costs for the computational power needed to run such
sophisticated algorithms, and the emergence of new application areas
such as the World Wide Web (WWW), digital libraries, and video- and
camera-based OCR. The use of OCR is spreading from high-volume, niche
domains to more general tasks, including the processing of noisy
"real-world" documents, photocopies, and faxes.

Beyond OCR, document recognition includes the recovery of a document's
logical structure and format. This encompasses decomposing a document
into its various fundamental components (sentences, paragraphs, figures,
tables, etc.), tagging these units, and then determining a higher-level
structure for the document as a whole. Advanced machine learning
techniques may make it possible to fully recover the structure of tables
and equations, and thus to understand their content, or to convert line
drawings from raster to a vector format in which the resulting graphical
objects are endowed with semantic meaning. Syntactic representation of
logical structure (e.g., using grammars) and syntax-directed recognition
are another important area where research contributions are solicited.

One primary reason for digitizing existing paper materials is, of
course, to simplify retrieval and organization of information. Therefore
we are particularly interested in papers which address any of the
following issues: (1) retrieval in the face of corrupted readings of the
terms in a document; (2) retrieval based on sketches, images, tables,
diagrams or other non-linguistic objects that appear in the document;
(3) retrieval based on text appearing with non-standard alignment, in
images or graphics; (4) recognition and tagging of mathematical arrays
and equations which serve as indicators of subject content or
methodology used in the document; (5) novel methods for retrieval and
organization of information based on text or other information in a
document. Papers addressing retrieval-specific issues are encouraged to
use a standard methodology from either statistics (such as the ROC
representation) or IR (such as precision versus recall) to assess the
effectiveness of proposed techniques against the endpoint goal of
correct recognition and retrieval of the entire document, or a section
thereof.
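
As an illustration of the IR measure mentioned above, the following is
a minimal sketch of precision and recall for a single query. It is not
a required evaluation protocol; the function name and the document IDs
are hypothetical and chosen only for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs returned by the retrieval system (hypothetical data)
    relevant:  IDs judged relevant in the ground truth (hypothetical data)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually returned
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: four documents returned, three truly relevant.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

Computing the pair at several cutoffs in a ranked result list gives the
points of a precision versus recall curve.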

Papers are solicited in the following areas:

Recognition
* algorithms and systems for machine-printed and handwritten character
and word recognition, especially for degraded documents (e.g., faxes or
old/historical documents)
* large scale conversion of historical document collections
* quality assurance methods and systems in DRR
* character and word segmentation techniques
* identification and analysis of tables or equations
* page segmentation, including hierarchical decomposition of documents
into text regions, colored/textured background, halftones, line-art,
etc.
* logical structure analysis, linguistic representation of structure
and syntax-directed recognition of logical structure
* raster-to-vector conversion of line-art, maps, and technical
drawings
* filtering and enhancement techniques for document images
* document image compression
* document degradation models
* video- and camera-based OCR
* applications of document recognition to the WWW and digital
libraries
* techniques to support spoken language access to document text (audio
browsing of document databases)
* multilingual character recognition
* other topics relating to document analysis and character
recognition.

Retrieval
* impact of recognition accuracy on retrieval effectiveness
* recovery and use of logical structure for retrieval
* information extraction from forms
* relevance feedback techniques for document retrieval
* cross-language and multi-lingual retrieval
* categorization of text documents and imaged documents
* summarization of text documents and imaged documents
* keyword spotting in document images
* approximate string matching algorithms for OCR text
* non-textual retrieval methods
* image and multimedia search
* interfaces for retrieval
* benchmarking and evaluation issues
* other topics relating to the retrieval of documents and document
images.

Note: submissions to Document Recognition and Retrieval XII should be
abbreviated papers (5-7 pages). The paper should be informative and address
the following questions:
  i) What is the paper about?
  ii) What is the original contribution?
  iii) What is the most closely related work by others and how does
this work differ?
  iv) How can others make use of this work?
  v) What are the main experimental/theoretical results?
Full papers (10-12 pages) will be needed for the final proceedings.

For more information and submission instructions, please see:
http://www.electronicimaging.org or
http://electronicimaging.org/call/05/conferences/index.cfm?fuseaction=EI117

Abbreviated papers (5-7 pages) Due Date: 5 July 2004
Manuscript (10-12 pages) Due Date: 25 October 2004
Final Summary (200-word abstract for program book) Due Date: 15
November 2004

Proceedings of this conference will be published and available at the
meeting.  The Abstract and Manuscript due dates must be strictly
observed.

Submissions imply the intent of at least one author to register, attend
the symposium, and present the paper (either orally or in poster
format).