Search the full text in PDF files

From LemonWiki共筆
Revision as of 10:39, 19 June 2018 by Planetoid (talk | contribs)



Finding information across multiple PDF files (cross-document PDF full-text search)

Suggestion

  • full text search: Adobe Reader is a good choice because it highlights and locates the keywords you type.
  • metadata search: Metadata is data about data. You can fill in information such as the author and keywords when you generate the PDF file. PDF Explorer and xPDFSearch (a Total Commander extension) are both good choices for metadata search.
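
As a rough command-line illustration of metadata search, the sketch below lists PDF files whose metadata field contains a given value. It assumes the pdfinfo tool (shipped with Poppler and Xpdf) is installed; pdfinfo is not one of the tools compared on this page, and the function names metadata_field and find_by_metadata are hypothetical. The match is a plain case-sensitive substring test.

```shell
# Print the value of one "Key: value" field from pdfinfo-style output on stdin.
metadata_field() {
  awk -v key="$1" -F': *' '$1 == key { sub(/^[^:]*: */, ""); print }'
}

# List PDFs under DIR whose FIELD (e.g. Author, Keywords) contains VALUE.
# Assumes pdfinfo is available; files it cannot read are skipped silently.
find_by_metadata() {
  dir=$1 field=$2 value=$3
  find "$dir" -type f -name '*.pdf' | while IFS= read -r file; do
    found=$(pdfinfo "$file" 2>/dev/null | metadata_field "$field")
    case "$found" in
      *"$value"*) printf '%s\n' "$file" ;;
    esac
  done
}
```

For example, find_by_metadata ~/papers Author someone would print the paths of PDFs under ~/papers whose Author field contains "someone" (the directory and author name are placeholders).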



Comparison of Solutions

| PDF type | Software / service | Full text search | Metadata search | Comments |
|---|---|---|---|---|
| [Good] Text-PDF | Adobe Reader 7.0.7 or Adobe Acrobat | OK | OK (but slow) | (1) The search function combines full-text and metadata search; (2) locates the keywords you type |
| Text-PDF | Adobe SHARE beta | OK (English only) | No | access: 2007-11-28 |
| [Good] Text-PDF | Foxit Reader v. 5.1.0 (Foxit Reader Portable) | OK | No | Able to locate the keywords you typed |
| Text-PDF | Evernote (Premium Feature: PDF Search, $) | OK and quick; searching keywords in Chinese is OK [Good] | OK | Highlights the keyword you search for but does not locate its position [!]. access: 2012-06-27 |
| Text-PDF | Gmail | No | No | Gmail search does support searching filenames in Chinese. access: 2007-05-17 |
| Text-PDF | Google Desktop Search v4 | OK, but only indexes the first 10,000 words | Title only | |
| Text-PDF | Locate32 3.0.8.1200 | No, only finds some words | OK (English only) | access: 2008-02-07 |
| Text-PDF | PDF Explorer 1.5 | OK | OK | (1) Does not highlight and locate the keywords you type; (2) extracts and indexes the internal images |
| Text-PDF | PDF-XChange Viewer 1.0 (Build 0017) | OK | No | (1) Searching "elearning" will find "creative learning", "e-Learning", and "elearning"; (2) introduction by 異塵行者 |
| Text-PDF | Windows Desktop Search 02.06.5000.5378 | OK (with PDF IFilter[1]) | OK | e.g. author:someone |
| Text-PDF | Windows Search 4.0 | OK | OK (Chinese supported) | (1) Does not highlight and locate the keywords you type; (2) indexes too many file types and is not easy to customize |
| Text-PDF | xPDFSearch 1.02 (Total Commander extension) | OK | OK | Does not highlight and locate the keywords you type |
| Text-PDF | Yahoo! Desktop Search 1.2 | OK | No | (1) Does not highlight and locate the keywords you type; (2) does not support Chinese folder names |
| [Good] Text-PDF | Yahoo! Mail | OK | No | Supports English only. access: 2007-05-17 |
| Image-PDF | Google Desktop Search + OmniPage Search Indexer | OK, but only indexes the first 10,000 words | Title only | Quick, English only |

pdfgrep for Linux & Mac [1]

  • PDF type: Text-PDF
  • Full text search: Available
  • Metadata search: Not available
  • Annotation search:
  • Chinese issue: ok
  • Indexing for better performance:
  • Locate the keywords you type: ok
  • Support boolean search:
    • OR: to match content that contains TERM_A or TERM_B, quote the pattern so the shell does not treat | as a pipe, e.g. pdfgrep -n --max-count 10 'TERM_A|TERM_B' foo.pdf
    • AND: add the option -P, --perl-regexp[2] to match content that contains both TERM_A and TERM_B, e.g. pdfgrep -n --max-count 10 -i -P '(?=.*TERM_A)(?=.*TERM_B).*' foo.pdf
  • Comments:
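
The OR and AND patterns from the boolean search item above can be built for any number of terms and applied to a whole folder of PDFs. The following is a minimal sketch, not a definitive recipe: the function names (or_pattern, and_pattern, search_any, search_all) are hypothetical, it assumes a pdfgrep build that supports -r (recursive search), --include and -P, and terms are inserted as-is, so regex metacharacters inside a term keep their regex meaning.

```shell
# or_pattern TERM...: join terms with | (extended-regex alternation),
# e.g. TERM_A|TERM_B
or_pattern() {
  pattern=""
  for term in "$@"; do
    if [ -z "$pattern" ]; then
      pattern="$term"
    else
      pattern="$pattern|$term"
    fi
  done
  printf '%s\n' "$pattern"
}

# and_pattern TERM...: one lookahead per term (PCRE, needs pdfgrep -P),
# e.g. (?=.*TERM_A)(?=.*TERM_B).*
and_pattern() {
  pattern=""
  for term in "$@"; do
    pattern="$pattern(?=.*$term)"
  done
  printf '%s\n' "$pattern.*"
}

# search_any DIR TERM...: lines containing at least one term, in every PDF under DIR.
search_any() {
  dir=$1; shift
  pdfgrep -r -n -i --include='*.pdf' "$(or_pattern "$@")" "$dir"
}

# search_all DIR TERM...: lines containing every term, in every PDF under DIR.
search_all() {
  dir=$1; shift
  pdfgrep -r -n -i -P --include='*.pdf' "$(and_pattern "$@")" "$dir"
}
```

For example, search_all ~/papers TERM_A TERM_B would print, with file name and page number, each line that contains both terms (the directory is a placeholder). Because the lookaheads are evaluated against a single line of text, the AND search only finds terms that occur on the same line, not merely in the same document.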

$ PDF Search v. 1.7 for Mac

  • PDF type: Text-PDF. Not for Image-PDF.
  • Full text search: Available
  • Metadata search: Not available
  • Annotation search:
  • Chinese issue: Not OK!
  • Indexing for better performance: Available
  • Locate the keywords you type: Available
  • Support boolean search: Available. See details on Narrate Results.
  • Comments: Good for searching PDF documents in English. There are still some technical issues with Chinese.

Microsoft OneNote [Last visited: 2018-06-19]

  • PDF type: Text-PDF / Image-PDF
  • Full text search: Available.
  • Metadata search:
  • Annotation search:
  • Chinese issue:
  • Indexing for better performance:
  • Locate the keywords you type: Does not highlight the location of the matched keyword.
  • Support boolean search:
  • Comments:

PDF type

  • Text-PDF: A PDF file generated from text documents.
  • Image-PDF: A PDF file generated from image files.

(left blank intentionally)

* PDF type: Text-PDF / Image-PDF
* Full text search: 
* Metadata search: 
* Annotation search:
* Chinese issue:
* Indexing for better performance:
* Locate the keywords you type: 
* Support boolean search:
* Comments: 

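Other software for organizing and managing PDF documents

  • Adobe - Digital Editions v1.0.467
    • organization: bookshelf (folder; a PDF file can be placed in only one bookshelf)
    • search: can only search text within a single PDF file; cannot search across files.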


Further reading

References