<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: On metadata, indexing, and mucking around with PDFs</title>
	<atom:link href="http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/</link>
	<description>A survival guide for the 21st century researcher</description>
	<lastBuildDate>Wed, 01 Feb 2012 09:37:35 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Lincoln</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-140802</link>
		<dc:creator>Lincoln</dc:creator>
		<pubDate>Thu, 01 Sep 2011 21:48:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-140802</guid>
		<description>It is kind of disappointing that this article is several years old and there is still no better solution for simplifying the organization and searchability of PDFs. I think the main reason, as the article mentions, is that people are still thinking in a paper world. Jose is right: academic papers are still largely written and classified in a way that is more conducive to printing. It&#039;s time for the hangers-on to let go and realize that brick-and-mortar libraries are basically a thing of the past.

On a side note, I love your logo on your banner. My brain feels like that about 5 nights a week.</description>
		<content:encoded><![CDATA[<p>It is kind of disappointing that this article is several years old and there is still no better solution for simplifying the organization and searchability of PDFs. I think the main reason, as the article mentions, is that people are still thinking in a paper world. Jose is right: academic papers are still largely written and classified in a way that is more conducive to printing. It&#8217;s time for the hangers-on to let go and realize that brick-and-mortar libraries are basically a thing of the past.</p>
<p>On a side note, I love your logo on your banner. My brain feels like that about 5 nights a week.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Concrete Driveway Price</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-140327</link>
		<dc:creator>Concrete Driveway Price</dc:creator>
		<pubDate>Mon, 15 Aug 2011 02:46:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-140327</guid>
		<description>You know, I never gave metadata a second thought on PDFs. I mean, sure, for websites, but not for downloadable documents. Learn something new every day.</description>
		<content:encoded><![CDATA[<p>You know, I never gave metadata a second thought on PDFs. I mean, sure, for websites, but not for downloadable documents. Learn something new every day.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Frank</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-138635</link>
		<dc:creator>Frank</dc:creator>
		<pubDate>Thu, 28 Apr 2011 05:30:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-138635</guid>
		<description>Essentially all Reference Management Software available today performs extremely poorly when it comes to metadata. Mendeley and Zotero claim to be able to scan PDFs and automatically retrieve information such as author, title, and year, but their performance is dismal. I have hundreds of PDFs, and for 99 out of 100 files neither can retrieve accurate information. Worse, Zotero&#039;s PDF scan does not work on 64-bit systems.
Neither Zotero nor Mendeley supports writing to XMP, which means that even if one enters metadata, that information is not stored inside the PDF. JabRef is the only software that supports writing XMP to PDF, but it does not scan PDFs for existing metadata. Endnote neither scans for or retrieves metadata nor permits writing to the XMP of the PDF.
What every simple music program like iTunes does with music files, what any kind of photo management software does on the fly - writing metadata tags into the files and searching the internet for more details - NOT one single Reference Management Software can do!!!
Really, really sad...</description>
		<content:encoded><![CDATA[<p>Essentially all Reference Management Software available today performs extremely poorly when it comes to metadata. Mendeley and Zotero claim to be able to scan PDFs and automatically retrieve information such as author, title, and year, but their performance is dismal. I have hundreds of PDFs, and for 99 out of 100 files neither can retrieve accurate information. Worse, Zotero&#8217;s PDF scan does not work on 64-bit systems.<br />
Neither Zotero nor Mendeley supports writing to XMP, which means that even if one enters metadata, that information is not stored inside the PDF. JabRef is the only software that supports writing XMP to PDF, but it does not scan PDFs for existing metadata. Endnote neither scans for or retrieves metadata nor permits writing to the XMP of the PDF.<br />
What every simple music program like iTunes does with music files, what any kind of photo management software does on the fly &#8211; writing metadata tags into the files and searching the internet for more details &#8211; NOT one single Reference Management Software can do!!!<br />
Really, really sad&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pandammonium: blogs [pandammonia]</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-17398</link>
		<dc:creator>Pandammonium: blogs [pandammonia]</dc:creator>
		<pubDate>Thu, 24 Jan 2008 23:21:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-17398</guid>
		<description>[...] On metadata, indexing, and mucking around with PDFs &#124; Academic Productivity [...]</description>
		<content:encoded><![CDATA[<p>[...] On metadata, indexing, and mucking around with PDFs | Academic Productivity [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-4495</link>
		<dc:creator>Mark</dc:creator>
		<pubDate>Thu, 06 Sep 2007 00:25:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-4495</guid>
		<description>I&#039;m joining the thread a bit late, but I&#039;m sure the discussion concerning document storage and attributes continues.

I appreciate journal articles in PDF format, and prefer it to text-based documents (that is, full-text articles that come in HTML rather than a proprietary format) because it replicates the journal look and feel. I don&#039;t feel that this is too wedded to the paper age; rather, it continues the investment we&#039;ve all made in publishing and consuming the articles.

I DO want to learn more about how to use metadata to my advantage, and will check into the resources already mentioned. I use Thomson&#039;s Endnote, which I like (version X for Mac - earlier incarnations were problematic) but which I wish could handle my PDFs better. In particular, I want Endnote not only to store them (tagged with keywords, etc.) but to functionally work with them: to embed citation data in the PDF, for example, or even my own abstract. I often generate my own PDFs from scans, so they are not even searchable by text beyond the title.

For highlighting, I&#039;m starting to work with SKIM, which uses the metadata capacity to store highlighting. Evidently the shortcoming at present is that the metadata may not be accessible to all programs. And as someone noted, transportability is key.</description>
		<content:encoded><![CDATA[<p>I&#8217;m joining the thread a bit late, but I&#8217;m sure the discussion concerning document storage and attributes continues.</p>
<p>I appreciate journal articles in PDF format, and prefer it to text-based documents (that is, full-text articles that come in HTML rather than a proprietary format) because it replicates the journal look and feel. I don&#8217;t feel that this is too wedded to the paper age; rather, it continues the investment we&#8217;ve all made in publishing and consuming the articles.</p>
<p>I DO want to learn more about how to use metadata to my advantage, and will check into the resources already mentioned. I use Thomson&#8217;s Endnote, which I like (version X for Mac &#8211; earlier incarnations were problematic) but which I wish could handle my PDFs better. In particular, I want Endnote not only to store them (tagged with keywords, etc.) but to functionally work with them: to embed citation data in the PDF, for example, or even my own abstract. I often generate my own PDFs from scans, so they are not even searchable by text beyond the title.</p>
<p>For highlighting, I&#8217;m starting to work with SKIM, which uses the metadata capacity to store highlighting. Evidently the shortcoming at present is that the metadata may not be accessible to all programs. And as someone noted, transportability is key.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Academic Productivity &#187; The definitive hack for your music collection and how to use it to help you reach productivity nirvana: MusicIP review</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-4195</link>
		<dc:creator>Academic Productivity &#187; The definitive hack for your music collection and how to use it to help you reach productivity nirvana: MusicIP review</dc:creator>
		<pubDate>Sun, 02 Sep 2007 15:11:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-4195</guid>
		<description>[...] I have talked about how managing music and academic paper collections are similar here; See also &#8216;noise for academics&#8216; by [...]</description>
		<content:encoded><![CDATA[<p>[...] I have talked about how managing music and academic paper collections are similar here; See also &#8216;noise for academics&#8216; by [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jose</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-350</link>
		<dc:creator>jose</dc:creator>
		<pubDate>Sun, 04 Mar 2007 16:03:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-350</guid>
		<description>@Kevin: Good point. In fact, in my case, I have to use Adobe Acrobat (expensive!) simply to highlight and comment on PDFs. Text would be better, with formatting being up to the user (e.g., CSS). Sometimes I don&#039;t like the fonts or the fact that the paper is two-column. There is not much you can do about it if it&#039;s in PDF format.

Problem is, I don&#039;t think any new format is going to take over from PDF anytime soon. Look at what happened with mp3 and ogg. Mp3 is proprietary; we pay a levy when we buy an mp3 player. Ogg gives equivalent, if not better, quality. It is open source, and here you don&#039;t find any of the typical criticisms of OSS: &quot;The interface sucks, too geeky&quot; (there is no interface in a file format!). &quot;The documentation sucks&quot; (no doc either). But very few people I know use ogg (I do), and most mp3 players don&#039;t even support it.</description>
		<content:encoded><![CDATA[<p>@Kevin: Good point. In fact, in my case, I have to use Adobe Acrobat (expensive!) simply to highlight and comment on PDFs. Text would be better, with formatting being up to the user (e.g., CSS). Sometimes I don&#8217;t like the fonts or the fact that the paper is two-column. There is not much you can do about it if it&#8217;s in PDF format.</p>
<p>Problem is, I don&#8217;t think any new format is going to take over from PDF anytime soon. Look at what happened with mp3 and ogg. Mp3 is proprietary; we pay a levy when we buy an mp3 player. Ogg gives equivalent, if not better, quality. It is open source, and here you don&#8217;t find any of the typical criticisms of OSS: &#8220;The interface sucks, too geeky&#8221; (there is no interface in a file format!). &#8220;The documentation sucks&#8221; (no doc either). But very few people I know use ogg (I do), and most mp3 players don&#8217;t even support it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-349</link>
		<dc:creator>Kevin</dc:creator>
		<pubDate>Sun, 04 Mar 2007 13:14:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-349</guid>
		<description>I don&#039;t care whether it supports XMP or not. There&#039;s no need to do PDF. We need to get over this what-it-looks-like-on-paper mentality. Give me text.</description>
		<content:encoded><![CDATA[<p>I don&#8217;t care whether it supports XMP or not. There&#8217;s no need to do PDF. We need to get over this what-it-looks-like-on-paper mentality. Give me text.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jose</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/comment-page-1/#comment-346</link>
		<dc:creator>jose</dc:creator>
		<pubDate>Sat, 03 Mar 2007 17:30:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-346</guid>
		<description>Thanks Atom, BadgerOne, Martin,

That&#039;s really nice. Does anyone know of a PDF creator that writes XMP, for those not using LaTeX? I use the open source PDFcreator (http://sourceforge.net/projects/pdfcreator/). It offers saving some basic fields, but I doubt that&#039;s XMP. Can you post a link to a PDF that has those XMP fields filled in? What software other than JabRef can read, catalog, and write XMP-enriched PDFs?</description>
		<content:encoded><![CDATA[<p>Thanks Atom, BadgerOne, Martin,</p>
<p>That&#8217;s really nice. Does anyone know of a PDF creator that writes XMP, for those not using LaTeX? I use the open source PDFcreator (<a href="http://sourceforge.net/projects/pdfcreator/" rel="nofollow">http://sourceforge.net/projects/pdfcreator/</a>). It offers saving some basic fields, but I doubt that&#8217;s XMP. Can you post a link to a PDF that has those XMP fields filled in? What software other than JabRef can read, catalog, and write XMP-enriched PDFs?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

