Credit: original code and concept/idea by @SANTHOSH C KURIAN. Thank you for the idea and vision.
It works for:
-- Company Master
-- Ledger Master
-- Ledger Display
-- Columnar Ledger Display
-- Vouchers
-- Voucher Register
-- Voucher Register - Columnar
-- DayBook
-- DayBook - Columnar
It supports all types of files except EXE, DAT, COM and other such dangerous files. All files can be opened in their respective programs.
Thanks. More or less, I get inspired and jolted into action (I am a lazybones by default) by innovative ideas from members. And I also get into action only if one of my clients needs the same thing.
Yes, you can read PDFs, audio files, video files; basically any file that you can view in Windows. All files will open in their respective default programs.
I mean reading a PDF and then showing its data in a Tally report. It could be done if the text and data in the PDF are not compressed with Deflate or another encoding.
But it would be far too difficult to extract data from every kind of PDF. Plus, camera-scanned PDFs are a headache to parse.
I think even a ChatGPT-level program would fail to understand every kind of PDF/Doc. So I highly doubt that Tally Solutions will be able to do it anytime soon.
That's not a big deal for Tally Solutions. Zip and Unzip functions are already there in Tally Developer, and Zip/Unzip uses the same compression encoding, so they could handle this if they exposed the deflate encoding/decoding functions they already have. The only problem is compression; otherwise you can read the PDF file using Open File : Read: Text.
Sir, are you sure you can read a PDF just by opening it as a text file? I tried it, but it doesn't seem to work. Do you have any working samples? I recently developed a PDF-to-Tally solution, but there I used Python and OCR to read the PDFs.
Save the text below in Notepad, then save the file with a .pdf extension.

Sample Text 1: it shows "Hello World!" in the PDF file.
Code:
%PDF-2.0
1 0 obj <</Type /Catalog /Pages 2 0 R>> endobj
2 0 obj <</Type /Pages /Kids [3 0 R] /Count 1>> endobj
3 0 obj<</Type /Page /Parent 2 0 R /Resources 4 0 R /MediaBox [0 0 500 800] /Contents 6 0 R>> endobj
4 0 obj<</Font <</F1 5 0 R>>>> endobj
5 0 obj<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>> endobj
6 0 obj
<</Length 44>>
stream
BT /F1 24 Tf 175 720 Td (Hello World!)Tj ET
endstream
endobj
xref
0 7
0000000000 65535 f
0000000009 00000 n
0000000056 00000 n
0000000111 00000 n
0000000212 00000 n
0000000250 00000 n
0000000317 00000 n
trailer <</Size 7/Root 1 0 R>>
startxref
406
%%EOF

Save the text below in Notepad, then save the file with a .pdf extension.

Sample Text 2: it shows a box and a curved line in the PDF.
Code:
%PDF-2.0
1 0 obj <</Type /Catalog /Pages 2 0 R>> endobj
2 0 obj <</Type /Pages /Kids [3 0 R] /Count 1 /MediaBox [0 0 500 800]>> endobj
3 0 obj<</Type /Page /Parent 2 0 R /Contents 4 0 R>> endobj
4 0 obj
<</Length 61>>
stream
175 720 m 175 500 l 300 800 400 600 v 100 650 50 75 re h S
endstream
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000059 00000 n
0000000140 00000 n
0000000202 00000 n
trailer <</Size 5/Root 1 0 R>>
startxref
314
%%EOF

The problem comes here, after stream: in most PDFs, the text/data between stream and endstream is compressed.
Code:
6 0 obj
<</Length 44>>
stream
BT /F1 24 Tf 175 720 Td (Hello World!)Tj ET
endstream
endobj
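To make the stream issue concrete, here is a rough sketch in Python (not TDL, since the deflate functions are not exposed there). It assumes the compressed streams are marked with /Filter /FlateDecode, which is zlib/deflate data, and it uses a simple regular expression instead of a real PDF parser. The names read_streams and shown_text are made up for this illustration.

```python
import re
import zlib

# Rough pattern for "N G obj << dict >> stream ... endstream".
# The (?!endobj) guard stops the dictionary from running into a later object.
# Assumption: simple PDFs only; /Length and nested dictionaries are ignored.
OBJ_STREAM_RE = re.compile(
    rb"(\d+)\s+\d+\s+obj\s*<<((?:(?!endobj).)*?)>>\s*stream\r?\n(.*?)endstream",
    re.DOTALL,
)

# Text shown with the Tj operator, e.g. (Hello World!)Tj.
# Escaped parentheses inside the string are not handled in this sketch.
TEXT_RE = re.compile(rb"\((.*?)\)\s*Tj")


def read_streams(pdf_bytes):
    """Return {object number: stream bytes}, inflating FlateDecode streams."""
    streams = {}
    for match in OBJ_STREAM_RE.finditer(pdf_bytes):
        obj_num = int(match.group(1))
        dictionary, data = match.group(2), match.group(3)
        if b"/FlateDecode" in dictionary:
            # decompressobj ignores the end-of-line bytes left after the data
            data = zlib.decompressobj().decompress(data)
        else:
            data = data.rstrip(b"\r\n")
        streams[obj_num] = data
    return streams


def shown_text(content):
    """Return the strings drawn with Tj in a decoded content stream."""
    return [t.decode("latin-1") for t in TEXT_RE.findall(content)]


if __name__ == "__main__":
    content = b"BT /F1 24 Tf 175 720 Td (Hello World!)Tj ET"
    packed = zlib.compress(content)
    # Same object as in Sample Text 1, but with the stream deflated
    compressed_obj = (
        b"6 0 obj\n<</Length %d /Filter /FlateDecode>>\nstream\n" % len(packed)
        + packed
        + b"\nendstream\nendobj\n"
    )
    for num, data in read_streams(compressed_obj).items():
        print(num, shown_text(data))
```

Opened as plain text, the compressed object shows only unreadable bytes between stream and endstream; after inflating, the same BT ... ET text comes back. Streams with other filters, encrypted PDFs and scanned pages are not covered by this sketch.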
Understood. But this method won't work for the majority of PDFs (due to various issues, like scanned images in PDFs). OCR would be a lot easier than this method. With this approach, you have to do all the work yourself.
If it's about parsing text from PDFs, then there is a free utility/library called Poppler, which contains a bunch of utilities including 'pdftotext' (to read a PDF and output it as text). It is based on the code of another program, xpdfreader. Poppler's latest binary executable releases and its source code are available on GitHub. Is this helpful to you in any way?
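For anyone who wants to try the pdftotext route, here is a small Python sketch, assuming Poppler is installed and pdftotext is on the PATH. The function names pdftotext_command and extract_text are made up for this example; the -layout option and "-" (write to standard output) are regular pdftotext arguments.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, layout=True):
    """Build the pdftotext command line; '-' sends the text to stdout."""
    cmd = ["pdftotext"]
    if layout:
        # -layout tries to keep the physical layout (columns) of the page
        cmd.append("-layout")
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, layout=True):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH; install Poppler first")
    done = subprocess.run(
        pdftotext_command(pdf_path, layout),
        capture_output=True,
        check=True,
    )
    return done.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "invoice.pdf" is a placeholder file name for this example
    print(extract_text("invoice.pdf"))
```

The output could be saved to a text file and then read from Tally as an ordinary text file. Note that pdftotext only extracts text that is actually stored in the PDF, so camera-scanned PDFs would still need OCR.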