This is a bare-bones utility that converts a stream to text for full-text search. There are other alternatives, but none of them run with a small footprint.

At the moment it supports:
- office documents
- pdf (TODO: remove dependency on pdftotext)
- text-based files
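
To make the idea concrete, here is a minimal Python sketch of how such a stream-to-text conversion could work. It is not this project's actual code or API: the function names (`sniff`, `docx_to_text`, `pdf_to_text`, `stream_to_text`) are hypothetical, the format detection by leading bytes is an assumption, "office documents" is narrowed to `.docx` for brevity, and the PDF branch assumes a `pdftotext` binary is available on `PATH`.

```python
import io
import subprocess
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from typing import BinaryIO

# WordprocessingML namespace used by the main part of a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def sniff(head: bytes) -> str:
    """Guess the kind of document from its first bytes (illustrative only)."""
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"PK\x03\x04"):  # zip container, as used by .docx/.xlsx/.pptx
        return "office"
    return "text"


def docx_to_text(data: bytes) -> str:
    """Collect the text runs of each paragraph in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W_NS + "p"):
        paragraphs.append("".join(t.text or "" for t in p.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def pdf_to_text(data: bytes) -> str:
    """Shell out to pdftotext; a "-" output file sends the text to stdout."""
    # Assumption: a POSIX system, where the temp file can be reopened by name.
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(data)
        tmp.flush()
        result = subprocess.run(
            ["pdftotext", tmp.name, "-"], check=True, capture_output=True
        )
    return result.stdout.decode("utf-8", errors="replace")


def stream_to_text(stream: BinaryIO) -> str:
    """Read the whole stream and dispatch on the detected kind."""
    data = stream.read()
    kind = sniff(data[:8])
    if kind == "pdf":
        return pdf_to_text(data)
    if kind == "office":
        return docx_to_text(data)
    return data.decode("utf-8", errors="replace")
```

A caller would pass any binary stream, for example `stream_to_text(open("report.docx", "rb"))`, and feed the returned string to the search indexer. Reading the whole stream into memory keeps the sketch short; a version aiming for a small footprint would likely process the input in chunks.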