<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: On metadata, indexing, and mucking around with PDFs</title>
	<atom:link href="http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/</link>
	<description>A survival guide for the 21st century researcher</description>
	<pubDate>Fri, 22 Aug 2008 02:11:31 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
		<item>
		<title>By: Pandammonium: blogs [pandammonia]</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-17398</link>
		<dc:creator>Pandammonium: blogs [pandammonia]</dc:creator>
		<pubDate>Fri, 25 Jan 2008 00:21:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-17398</guid>
		<description>[...] On metadata, indexing, and mucking around with PDFs &#124; Academic Productivity [...]</description>
		<content:encoded><![CDATA[<p>[...] On metadata, indexing, and mucking around with PDFs | Academic Productivity [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-4495</link>
		<dc:creator>Mark</dc:creator>
		<pubDate>Thu, 06 Sep 2007 00:25:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-4495</guid>
		<description>I'm joining the thread a bit late, but I'm sure the discussion concerning document storage and attributes continues.

I appreciate journal articles in PDF format, and prefer it to text-based documents (as in full-text articles that come without proprietary format but in html) as it replicates the journal look and feel. I don't feel that this is too wedded to the paper age, but rather continues the investment we've all made in publishing and consuming the articles. 

I DO want to learn more about how to use metadata to my advantage, and will check into the resources already mentioned. I use Thomson's EndNote, which I like (version X for Mac; earlier incarnations were problematic) but which I wish could handle my PDFs better. In particular, I want EndNote not only to store them (tagged with keywords, etc.) but to work with them functionally: to embed citation data in the PDF, for example, or even my own abstract. I often generate my own PDFs from scans, so they are not even searchable by text beyond the title.

For highlighting, I'm starting to work with SKIM, which uses the metadata capacity to store highlighting. Evidently the shortcoming at present is that the metadata may not be accessible to all programs. And as someone noted, transportability is key.</description>
		<content:encoded><![CDATA[<p>I&#8217;m joining the thread a bit late, but I&#8217;m sure the discussion concerning document storage and attributes continues.</p>
<p>I appreciate journal articles in PDF format, and prefer it to text-based documents (as in full-text articles that come without proprietary format but in html) as it replicates the journal look and feel. I don&#8217;t feel that this is too wedded to the paper age, but rather continues the investment we&#8217;ve all made in publishing and consuming the articles. </p>
<p>I DO want to learn more about how to use metadata to my advantage, and will check into the resources already mentioned. I use Thomson&#8217;s EndNote, which I like (version X for Mac; earlier incarnations were problematic) but which I wish could handle my PDFs better. In particular, I want EndNote not only to store them (tagged with keywords, etc.) but to work with them functionally: to embed citation data in the PDF, for example, or even my own abstract. I often generate my own PDFs from scans, so they are not even searchable by text beyond the title.</p>
<p>For highlighting, I&#8217;m starting to work with SKIM, which uses the metadata capacity to store highlighting. Evidently the shortcoming at present is that the metadata may not be accessible to all programs. And as someone noted, transportability is key.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Academic Productivity &#187; The definitive hack for your music collection and how to use it to help you reach productivity nirvana: MusicIP review</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-4195</link>
		<dc:creator>Academic Productivity &#187; The definitive hack for your music collection and how to use it to help you reach productivity nirvana: MusicIP review</dc:creator>
		<pubDate>Sun, 02 Sep 2007 17:11:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-4195</guid>
		<description>[...] I have talked about how managing music and academic paper collections are similar here; see also &#8216;noise for academics&#8217; by [...]</description>
		<content:encoded><![CDATA[<p>[...] I have talked about how managing music and academic paper collections are similar here; see also &#8216;noise for academics&#8217; by [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jose</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-350</link>
		<dc:creator>jose</dc:creator>
		<pubDate>Sun, 04 Mar 2007 16:03:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-350</guid>
		<description>@Kevin: Good point. In my case, I have to use Adobe Acrobat (expensive!) just to highlight and comment on PDFs. Text would be better, with formatting left up to the user (e.g., via CSS). Sometimes I don't like the fonts, or the fact that the paper is two-column. There's not much you can do about that if it's in PDF format.

The problem is, I don't think any new format is going to take over from PDF anytime soon. Look at what happened with MP3 and Ogg. MP3 is proprietary: we pay a royalty whenever we buy an MP3 player. Ogg gives equivalent, if not better, quality. It is open source, and none of the typical criticisms of OSS apply here: "The interface sucks, too geeky" (there is no interface in a file format!). "The documentation sucks" (no docs either). But very few people I know use Ogg (I do), and most MP3 players don't even support it.</description>
		<content:encoded><![CDATA[<p>@Kevin: Good point. In my case, I have to use Adobe Acrobat (expensive!) just to highlight and comment on PDFs. Text would be better, with formatting left up to the user (e.g., via CSS). Sometimes I don&#8217;t like the fonts, or the fact that the paper is two-column. There&#8217;s not much you can do about that if it&#8217;s in PDF format.</p>
<p>The problem is, I don&#8217;t think any new format is going to take over from PDF anytime soon. Look at what happened with MP3 and Ogg. MP3 is proprietary: we pay a royalty whenever we buy an MP3 player. Ogg gives equivalent, if not better, quality. It is open source, and none of the typical criticisms of OSS apply here: &#8220;The interface sucks, too geeky&#8221; (there is no interface in a file format!). &#8220;The documentation sucks&#8221; (no docs either). But very few people I know use Ogg (I do), and most MP3 players don&#8217;t even support it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-349</link>
		<dc:creator>Kevin</dc:creator>
		<pubDate>Sun, 04 Mar 2007 13:14:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-349</guid>
		<description>I don't care whether it supports XMP or not. There's no need to use PDF. We need to get over this "what it looks like on paper" mentality. Give me text.</description>
		<content:encoded><![CDATA[<p>I don&#8217;t care whether it supports XMP or not. There&#8217;s no need to use PDF. We need to get over this &#8220;what it looks like on paper&#8221; mentality. Give me text.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jose</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-346</link>
		<dc:creator>jose</dc:creator>
		<pubDate>Sat, 03 Mar 2007 17:30:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-346</guid>
		<description>Thanks Atom, BadgerOne, Martin,

That's really nice. Does anyone know of a PDF creator that writes XMP, for those not using LaTeX? I use the open source PDFCreator (http://sourceforge.net/projects/pdfcreator/). It lets you save some basic fields, but I doubt that's XMP. Can you post a link to a PDF that has those XMP fields filled in? What software other than JabRef can read, catalog, and write XMP-enriched PDFs?</description>
		<content:encoded><![CDATA[<p>Thanks Atom, BadgerOne, Martin,</p>
<p>That&#8217;s really nice. Does anyone know of a PDF creator that writes XMP, for those not using LaTeX? I use the open source PDFCreator (http://sourceforge.net/projects/pdfcreator/). It lets you save some basic fields, but I doubt that&#8217;s XMP. Can you post a link to a PDF that has those XMP fields filled in? What software other than JabRef can read, catalog, and write XMP-enriched PDFs?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-343</link>
		<dc:creator>Martin</dc:creator>
		<pubDate>Sat, 03 Mar 2007 12:52:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-343</guid>
		<description>XMP is an interesting solution. For those using LaTeX and BibTeX to manage their references, I recommend trying the new version 2.2 of JabRef (http://jabref.sourceforge.net).
It can write BibTeX metadata to a PDF file and also import metadata from a PDF file.

In other words, you can store the bibliographic information (journal name, page range, authors, title, ...) in a structured way in the PDF files themselves.

Even if publishers do not yet provide this information in their PDF articles, users can already add it themselves and benefit when searching for the right reference.

Martin</description>
		<content:encoded><![CDATA[<p>XMP is an interesting solution. For those using LaTeX and BibTeX to manage their references, I recommend trying the new version 2.2 of JabRef (http://jabref.sourceforge.net).<br />
It can write BibTeX metadata to a PDF file and also import metadata from a PDF file.</p>
<p>In other words, you can store the bibliographic information (journal name, page range, authors, title, &#8230;) in a structured way in the PDF files themselves.</p>
<p>Even if publishers do not yet provide this information in their PDF articles, users can already add it themselves and benefit when searching for the right reference.</p>
<p>Martin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: BadgerOne</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-330</link>
		<dc:creator>BadgerOne</dc:creator>
		<pubDate>Thu, 01 Mar 2007 14:11:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-330</guid>
		<description>atom prober is spot on with his comment</description>
		<content:encoded><![CDATA[<p>atom prober is spot on with his comment</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: atom prober</title>
		<link>http://www.academicproductivity.com/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-297</link>
		<dc:creator>atom prober</dc:creator>
		<pubDate>Sat, 24 Feb 2007 21:41:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.academicproductivity.com/blog/2007/on-metadata-indexing-and-mucking-around-with-pdfs/#comment-297</guid>
		<description>There is no reason to abandon PDF. PDF supports &lt;a href="http://en.wikipedia.org/wiki/Extensible_Metadata_Platform" rel="nofollow"&gt;XMP&lt;/a&gt;. XMP can carry all the Dublin Core metadata that &lt;a href="http://www.zotero.org" rel="nofollow"&gt;Zotero&lt;/a&gt;, &lt;a href="http://refbase.sourceforge.net/" rel="nofollow"&gt;refbase&lt;/a&gt;, OpenOffice.org, and other products are using.

We just need publishers to care enough to include this data, and more end-user tools to index, view, search, and edit it.</description>
		<content:encoded><![CDATA[<p>There is no reason to abandon PDF. PDF supports <a href="http://en.wikipedia.org/wiki/Extensible_Metadata_Platform" >XMP</a>. XMP can carry all the Dublin Core metadata that <a href="http://www.zotero.org" >Zotero</a>, <a href="http://refbase.sourceforge.net/" >refbase</a>, OpenOffice.org, and other products are using.</p>
<p>We just need publishers to care enough to include this data, and more end-user tools to index, view, search, and edit it.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
