How can I build a distributed document processing platform in the cloud (like scribd.com)?
Users will upload large numbers of documents, such as emails, spreadsheets, images, and PDFs.
Each document needs metadata and text extraction, and is then converted to PDF for viewing.
Searching by keyword will be very important in the application.
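To make the keyword-search requirement concrete, here is a minimal sketch of an in-memory inverted index over text that has already been extracted. The names `tokenize` and `InvertedIndex` are hypothetical and only for illustration. A real platform would presumably shard or replace this with a distributed search service.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical helper: maps each keyword to the IDs of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index the extracted text of one document under its ID."""
        for token in tokenize(text):
            self._postings[token].add(doc_id)

    def search(self, query):
        """Return the IDs of documents that contain every keyword in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        # Copy the first posting set so the intersection does not mutate the index.
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


index = InvertedIndex()
index.add("doc1", "Quarterly budget spreadsheet")
index.add("doc2", "Budget email thread")
matches = index.search("budget email")
```

Here `matches` holds only the documents that contain both keywords. Queries are treated as AND over keywords, which is an assumption. Ranking, phrase search, and stemming are left out.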