Roughly weekly news and opinions from the Digital Preservation Coalition’s Head of Research and Practice, Paul Wheatley. Opinions are the opinions of Paul and those featured. Not the DPC. They’re just opinions, ok?

Let’s start with a statement from the Keepers Registry, following the last meeting of this group and the wind up of the current Jisc-funded project work. This collaboration should continue ok as a loose network of the keepers themselves, but I hope it gets the proper support it deserves from the rest of the community as this stuff is *important*:

Catch up on videos from Breaking Boundaries:

All the Jisc RDM stuff in one place, nice:

One for the diary:

Solid write up of self audit at UNT, with reference to some of the issues caught by the audit:

Here’s the first of a trio of great blog posts on the OPF site, the first delving into some file format ID performance issues…

…the second is an awesome look behind the scenes of a week in the life of David Clipsham, file format magic(ian) extraordinaire, with reference to another excellent post from Jen Mitcham and contributions from others.

…and the last is best summed up by the Owens:

The latest news leak on EU copyright reform, highlighting the current options and reaction from movers and shakers on twitter:

Great read on malware preservation, don’t miss this:

This is pretty nasty stuff…

…so it’s a good time to give a bump to the Doc Liberation folks who are gradually working through many of the format challenges created by file interoperability issues / vendor divergence of the past:

Wowzers, glad I got in early there. Good to have the old iPRES show back in Europe:

The Digital Preservation Awards are upon us, and the judges have been hard at work shortlisting…

…here’s what we came up with for the first award…

…the remaining shortlists will be out this week.

And on the DPC tip, we have a new webinar programme kicking off shortly for members:

A few final thoughts, and yes this is a biggie:

Nice feedback:

And with this mundane mention by the pointy haired boss, the term “big data” is now officially dead, yes?




Roughly weekly(?) news and opinions from the Digital Preservation Coalition’s Head of Research and Practice, Paul Wheatley. Opinions are the opinions of Paul and those featured. Not the DPC. They’re just opinions, ok?

After a 3 month break, DP News is finally back! It’s been a busy time at DPC central for a number of reasons, not least because we had our 3 big events in one week: our latest Board Meeting, our annual members unconference day (Connecting the Bits), and a briefing day on file format obsolescence. DPC members can catch up with Euan Cochrane’s presentation on file formats here, and the twitter action was pretty decent if you fancy a recap:

Of course, all this digital preservation excitement might have drowned out the other big story in the news (if only), so I’m afraid I’m going to cover some of the #brexit reaction. Of course, we’re aiming to keep things solid here at the DPC:

And in times of political crisis it really is important to know exactly who said what before they (of course) deleted it:

Not forgetting Politwoops of course.

Although the wider world still doesn’t quite understand preservation and archiving. No surprises there, I guess:

Funding is a big concern for our sector, but the Commission has been putting out reassuring messages…

The official line, and reality of previous funding going to non-EU members, suggests status quo for the time being. But the uncertainty and inevitable prejudices that emerge around such a divisive topic could take us elsewhere:

More here:

In other news, you may have seen me ranting about a lack of software preservation (particularly in the UK), but this looks encouraging:

It’s the latest round of the Digital Preservation Awards! Get your entry in now, all are welcome:

Some really detailed advice here:

More great progress from the Document Liberation folks. This plugs a serious obsolescence gap:

New sig files are out from TNA and PRONOM…

…and a bit more on plugging the research data coverage gaps:

Great to see that @JiscDataVault has been making good progress:

Significant release for the FITS tool. Release notes here:

Sharing some of the less positive stories from our community can be such a valuable way to learn. DPC is exploring how to encourage and support this kind of sharing, but in the meantime, check out this paper:

Some final thoughts and sillies…

Everyone loves a cluster:

Our favourite tag line from the Spruce Project was “Assume nothing, validate everything”. Damn right:

And my favourite #brexit gag. Nicely done:

Not truly obsolete, just a pain in the backside…







Let’s start with a new version of the veraPDF software which validates PDFs for you:

…and don’t forget:

Exactly what Charles says:

Excellent work from “Copyright for Knowledge”:

Knowledge Exchange report on software sustainability:

The Archives Unleashed hackathon looked fantastic…

…and there’s a great meta twitter blogged (nice) write up here:

#DPC ran “Standing on the digits of giants…” and I’d recommend checking out the tweets if you missed it. Lots of interesting titbits like this….

…and this:

Another great event coming up, so much on at the moment (and I didn’t even mention code4lib)!

Full proceedings are available with individual papers, not just in one horrible bundled PDF. Thanks iPRES!

New blog post from Sara Thompson on social media preservation:

Potentially massive news for digital preservation as a major OS gets built-in fixity checking, repair and other cool stuff, on by default. Let’s hope the licensing issues don’t get in the way:

I always shake my fist like this when I make progress:

Highly unusual case of government funding being cut (without negotiation) due to research data not being open enough. Unfortunately looks more like an excuse for cutting funding than the dawning of a new age…

Here’s an equation for you:

Excellent media piece on digital preservation from our DPC President. God bless the DP of C:


PASIG came to European shores at the end of the week and it was a good show:

I was thinking of meta-blogging it, but Dave Askey has already proper-blogged it, so check this out:

Here’s a final thought, with some good discussion if you follow the tweet:

I love you, BYEEEEEEE




The digital preservation world let out a collective sigh at news that the UK would be shelling out a not inconsiderable amount of money to continue committing its laws to vellum.

Which all seemed a bit short sighted when our budgets for doing things like preserving valuable at risk digital information are being squeezed so much. Like this valuable stuff…

…like when a newspaper goes and deletes all its blog posts without any warning…

…it’s kind of a big deal, so our web archives need support. Let’s be blunt about this:

Ok, I didn’t mean quite that blunt, but that will do. It looks like the Internet Archive and the UK web archive have at least some of the deleted blogs. It would be great to hear more on this…

And more importantly the work of our members:

Anyway, it’s a good job digital preservation folks didn’t get distracted by their own pointless long lived storage mediums. Oh, hang on a minute…

Really? It’s been, errrm, at least a few weeks since the last permanent long lived storage medium was announced and then disappeared never to be heard of again…

…this made me feel (slightly) better, as the news of 5 dimensional (huh?) storage got tweeted everywhere:

We need some good news. How about this:

And this:

An update on digital transfers to the UK National Archives, and the challenges faced:

Here’s a question for you:

I wonder if we can get a straight answer on that?

Excellent. What’s next? Richard Lehane on his latest Siegfried developments:

And a new release for DROID:

An excellent resource on an important subject:

On the disappearing web:

The first Software Preservation Network forum:

IDCC16 happened, and here are all the tweets via a nifty little archiving application:

The big news from the US was good news. Many were encouraged by the appointment of an actual librarian to the post, not to mention this:

There is of course always one wag:

Some more serious thoughts on the importance of this role:

As usual, let’s wrap up with something a bit lighter:

This is actually really old, but what a title!

Welcome. To the city that care forgot.



Roughly weekly news and opinions from the Digital Preservation Coalition’s Head of Research and Practice, Paul Wheatley. Opinions are the opinions of Paul and those featured. Not the DPC. They’re just opinions, ok? There may be some bad words.

It’s February, what happened?

Well, I’ll do the last week and a few highlights from the start of this year…

In case you missed it, check out my run down of the big issues in digital preservation from 2015:

Just because a PDF looks shiny and problem free on the outside, it isn’t necessarily hunky dory on the inside…

…Here’s Gary McGath’s reaction…

…and the analysis of the failed preservation planning process that kicked this off:

Various interesting preservation nuggets in here:

More format news. ePUB is *still* evolving in significant ways…

New format assessment from the BL:

“Simply put, DSpace 6 is where we are starting to build the *next* DSpace“:

On records management…

…and on the extreme end of the challenges records management is facing in the modern world:

Putting PREMIS actions into effect:

This got blogged and retweeted all over the place, so it’s an opportunity for me to feature my favourite gaming blog by stealth. And of course, it’s emulation in action again:

Great stuff from our colleagues in the Netherlands:

The final version of the phase 2 report from this project that is making Archivematica RDM friendly:

Great mission statement from the conclusion of this report on software preservation: “As NLM continues to research and pursue its own software preservation strategy, we hope to show our work, our mishaps, and our successes, in order to ensure that the digital stewardship community as a whole can grow and begin to tackle software in an enlightened and thorough manner.”

I think I might already have mentioned this, but it’s important, so for good measure:

Always read the small print. Unless it’s the small print for this blog. The invoice for every word you’ve downloaded is in the post….

The DPC held its annual student conference (and check the storify here):

And we said bon voyage to our DPC colleague Lorraine:

Apparently this was out for comment last year. Did anyone know? Needless to say, the twitter reaction was none too happy:

I’ll finish up with some of the usual sillies. You might need a dose of this after this week’s news…

…some nostalgia for the home computing geek of a certain age…

…And a final thought from everyone’s favourite Jason Scott:

People out here, it’s like they don’t even know the outside world exists. Might as well be living on the fucking Moon



Roughly annual opinions from the Digital Preservation Coalition’s Head of Research and Practice, Paul Wheatley. Opinions are the opinions of Paul and those featured. Not the DPC. They’re just opinions, ok?

By request, this is my summing up of the big themes in digital preservation from 2015. It’s based on my weekly meta-blogged trawl of the #digitalpreservation twitterverse and my experiences of chatting and working with our DPC members. As usual, these are of course still just my opinions.

The #digitalpreservation community finally sat up and made quite a lot of noise at the suggestion that there was going to be a “digital black hole”. I’m not sure we’re winning yet, but we’re fighting a good fight as the #nodigitaldarkage fun nicely reinforced. Clearly the digital preservation world is now a force to be reckoned with. We’ve really come a long way since our first tentative steps as a community in the late 1990s.

Copyright was as usual a fascinating but depressing topic. The various international trade deals (TTIP, TPP etc) marched onwards with rights holders at the reins and little in the way of obstacles. It’s difficult of course to protest against something when you have to rely on leaks to discover what the fine print says. Whether individual countries will adopt them when it comes to the (possibly) more democratic part of the process remains unclear. But it’s not looking good. UK law took several steps back, and frankly looked ludicrous, when the Copyright and Rights in Performances (Personal Copies for Private Use) Regulations 2014 were overturned by the High Court. Picturing any progress towards a more enlightened copyright regime (for our sector at least) is somewhat difficult in this context. I think the digital preservation community should be developing a unified and stronger voice on IPR in 2016.

In the UK, austerity has continued to erode and imperil many publicly-funded organisations. Digital preservation seems to have largely escaped the worst cuts, and the DPC’s membership has continued to rapidly increase as agencies instinctively look to collaborate where once they might have worked in isolation. Having said that, almost everywhere has been in a near constant state of reorganisation, taking an ongoing toll on staff time and morale. Continuous news of public library closures has been the public face of this colossal, short-sighted dismantling of information infrastructure. In my own backyard, the National Media Museum in Bradford survived a battle for survival, only in the last week to see announcements apparently to the contrary. There are of course many other examples right across the country.

Outsourcing digital preservation became a much more realistic proposition in 2015, particularly with web archiving but also with cloud based (full digital preservation) services hitting their stride. In one sense this is great. There are people we can have some trust in to do DP for us. In another sense, this creates another tricky problem. You can’t ever really trust anyone to do any preservation for you. How do you verify that they are doing the job well, without duplicating the work you’ve paid them to do?

Emulation continued its rapid march towards a seemingly inevitable dominance. The DPC proclaimed 2014 as the year of emulation (well actually that was me, DPC just gave the big DP Awards prize to our esteemed colleagues at Freiburg) but in 2015 technology advances and real applications that demonstrate the approach as both practical and cost effective, appeared thick and fast. While the emulation technology is rapidly moving from clever concept to practical product, the raw material is being left behind. Software preservation, at least in the UK, seems to be largely ignored as I blogged about some months back. Where is our Software Preservation Network?

ERMSs seem to be reaching practical end of life for many organisations, amidst the death of any optimistic hopes of records management delivering substantial rewards for those picking up the pieces of the “managing current business data” phase of the information lifecycle. Is this grossly oversimplifying the issues? Almost certainly, but it’s what I heard a lot of in 2015. ERMS vendor lock-in continued to get in the way of moving archival data to the preservation store. There is of course a near complete lack of DP requirements in the systems being used. There was the usual issue of poor adherence to records management procedures, but it’s the IT solutions that often fail to make the procedures practical for those on the ground. There were headline hitting data leaks. And of course there was a blizzard of data that in most cases did a lot more than threaten to overwhelm capability. These seem to be amongst the main drivers of this broader challenge, but there are more.

The response appears to be an alarming move towards mass deletion, political moves to squash FOI, indiscriminate broad strokes selection policies, and simpler information management systems that don’t look like they’re going to do much better than the ERMSs of old. The outcome is of course a lot of digital junk to appraise. I think we need a complete sea change in how we implement appraisal and we need a drive towards the intelligent application of software tools for sifting and filtering at scale, drawing on the kind of information analysis our web archiving colleagues are getting rather good at. Without putting too fine a point on it, the huge leap required here will not be easy for many archives to make. Richard Ovenden noted in his State of the Coalition address, as the new DPC President, that “mismanaging personal data is the corporate scandal du jour…” whether in the public or private sector. Getting serious about information management and preservation seems to be more essential than ever for avoiding short term disaster as well as ensuring long term sustainability.

Digital preservation seems to finally have broken through into the commercial world in earnest. This was signposted by the vendor showcase at iPRES at the end of 2014, and of course repeated at the same (really successful) conference in North Carolina in 2015. With a big marketing drive, Preservica dominated the market in the UK, although Arkivum seem to be doing none too badly as well. Libnova was the ‘new’ commercial offering on the block. Not to omit the progression of open source solutions such as Archivematica, which saw significant and exciting additions from many sectors. Commercial interest in DPC membership was on the rise in a significant way, and it feels like the digital preservation community is going to change in its make up quite rapidly over the next few years. Interesting times ahead…

Want to know what 2016 will hold? Well I can’t see into the future, so you’ll have to just put up with reading my news blog every week….

Thanks to the DPC team for their feedback and contributions to this post.


2015 Xmas Quiz

I’ve not quite finished my end of year round up yet, so this week we’re going a little off piste…

Last week I tested out my special Xmas quiz on my fellow DPC team members to ensure it was fit for purpose. As always, the team was very helpful and did not shy away from giving me constructive feedback. “It’s too obscure”, “This is hard, really hard”, “I won, and I only got 7 out of 25 correct”, and “What were we meant to be doing again?”. Clearly the quiz was, erm, fit for purpose and ready to hit the blog without any further work. So here it is!

In almost every DP News I have a little fun and wrap things up with a quote of some kind. Sometimes verbatim, sometimes corrupted with some digital preservation reference. Often connected to something in the blog post. This seemed perfect to pull together into one place, creating a quiz without me spending very much effort on it. Of course, that does mean it is a little obscure, as my colleagues observed. And yes, it has nothing to do with Xmas or 2015 for that matter.

To help you out just a little bit, many of the quotes are last lines or “famous” trademark sign offs, but not all. They come from films, novels, TV shows, songs, a TV advert and a poem. All you have to do is work out where the lines are from. Who said them, what the show was, the name of the movie or whatever.

1 That is all, go away
2 It’s good night from my ORCID iD. And it’s good night from his
3 I think he’s attempting re-ingest sir.
4 Good night and good luck
5 So long and thanks for all the websites
6 Game over man, game over
7 But it was all right, everything was all right, the struggle was finished. He had won the victory over himself. He loved OAIS
8 Safe journey, space fans… wherever you are.
9 Good fight, good night!
10 Peas out
11 I was cured all right
12 I took one more glance over my past life, then turned to the future. I was eager to embrace the world.
13 I hear the roar of the big machine
14 And so castles made of sand, slips into the sea, eventually
15 Round up all the AIPs in a field…
16 Alright, let’s get straight to the biscuits…
17 The creatures outside looked from digital object to digital object, but already it was impossible to say which was which.
18 Mess with the best, (still it seems) die like the rest
19 Oh Good. For a moment there, I thought we were in trouble.
20 All together now: Incorrect disc for-mat. It’s an incorrect disc for-mat….
21 Chapel Hill, with this conference you were spoiling us. Exchellente.
22 Tis better to have archived and lost everything, Than never to have tried to do it in the first place
23 Information Transit got the wrong man. I got the right man. The wrong one was delivered to me as the right man, I accepted him on good faith as the right man. Was I wrong?
24 There’s more to life than a little metadata you know. Don’tcha know that? And here y’are, and it’s a beautiful day. Well. I just don’t understand it.
25 Don’t hit me with them negative waves so early in the morning. Think the digital object will be there and it will be there. It’s a mother, beautiful digital object, and it’s gonna be there. Ok?

I’ll publish the “answers” in the new year. Obviously a bit of Googling would elicit most of the answers rather easily, but I know you wouldn’t cheat, would you? Given that there is no prize, you would of course just be cheating yourself…

We have Santa Claus. Tell him we have Santa for sale
