Wikipedia:WikiProject Józef Piłsudski Institute of America/Scalable Archive Project
Project idea
There are 3.5 million categories on the Wikimedia Commons
We will develop and implement categorization practices that can be scaled and adapted and that will help other GLAM projects grow. Categories on Wikimedia are currently underutilized by GLAM projects that would greatly benefit from them, such as document-based archives. Developing categorization practices will help other archive projects organize their digitized collections on Wikimedia and grow their open-source digital collections as more items are digitized. The Piłsudski Institute of America GLAM-Wiki Project will be the test-bed for a deeper, more meaningful implementation of Wikimedia categories.

More categories, more context
Archives using a structured data source (such as an archive using the EAD metadata standard) will be able to generate several levels of categorization for their Wikimedia Commons collections. We think that implementing intermediate categories that give context to the documents presented in our collections will make GLAM projects more attractive for Wiki users. Generating these categories from a data source structured to current, broadly implemented standards will make the process of creating a GLAM collection on Wikimedia easier.

While there are millions of categories available on Wikimedia, the vast majority of current Wikimedia Commons collections have a broad category, such as “paintings,” followed by an alphabetical listing of content. There is no automatic solution for categorization either during or after a batch upload, but we would at least like to develop ways to ease the linking of stand-alone category trees to the millions of categories already available. Intermediate categorization based on already available metadata would allow an archive to contextualize its work in ways that are easier for the Commons user to navigate. Making it easier to link stand-alone categorization to existing categories would benefit both Wikimedia and GLAM projects.
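As an illustration of how several levels of categorization might be derived from a structured source, the sketch below walks a heavily simplified, made-up EAD-like finding aid and lists one category path per archival level. The sample XML, the titles in it, and the function name `category_paths` are assumptions for illustration only; real EAD files use namespaces and many more elements, and this is not the project's actual script.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EAD-like finding aid (illustration only).
SAMPLE_EAD = """
<ead>
  <archdesc level="fonds">
    <did><unittitle>Example Fonds</unittitle></did>
    <dsc>
      <c level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c level="file">
          <did><unittitle>Letters 1920</unittitle></did>
        </c>
        <c level="file">
          <did><unittitle>Letters 1921</unittitle></did>
        </c>
      </c>
      <c level="series">
        <did><unittitle>Reports</unittitle></did>
      </c>
    </dsc>
  </archdesc>
</ead>
"""


def category_paths(node, parents=()):
    """Yield one tuple of titles per archival level, from the top level down."""
    title = node.findtext("did/unittitle")
    path = parents + (title,)
    yield path
    # Components are either direct <c> children or wrapped in a <dsc> element.
    for child in node.findall("c") + node.findall("dsc/c"):
        yield from category_paths(child, path)


root = ET.fromstring(SAMPLE_EAD)
paths = list(category_paths(root.find("archdesc")))
for path in paths:
    # Each path could become one nested category, e.g. fonds > series > file.
    print(" > ".join(path))
```

Each printed path corresponds to one candidate category, with its parent being the path one level shorter. How those candidates are named on Commons would still be an editorial decision.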
Rather than an alphabetical listing, a collection of paintings would be grouped by era, style, or artist, depending on what information the metadata contains and what Wikimedia categories exist. Individual items could appear in multiple categories depending on how they were tagged in the metadata, utilizing a key feature of the Wiki format.

Project goals
To help GLAM projects harness the power of the Wikimedia Commons and make them easier for users to browse.

Project plan
Our script is thus far able to create a category tree by mimicking the fonds and folder hierarchy used by the Piłsudski Institute archive. The results of running the script can be seen here: Collections of Józef Piłsudski Institute of America by fonds. These “fonds” are automatically added to files by the [Template:Piłsudski Institute document], while the category description is created by [Template:Józef Piłsudski Institute of America Category Description] and [Module:Józef Piłsudski Institute of America Template:Józef Piłsudski Institute of America Category Description]. The “Accession number” field in [Template:Piłsudski Institute document] adds links to those new categories.

Step One
We hope to utilize not only folder numbers but also titles and document tags to enable better browsing. This categorization will be generated from metadata, provided it is available and desired. Collections that utilize this information will be easier to create and will become friendlier data sources to browse.

Step Two
We would like to make connecting stand-alone categories created from metadata to already existing Wikimedia categories a less manual process. The more experience we gain by experimenting with our own Wikimedia Commons collections, the more streamlined we can make the process for others.

Step Three
Documenting our findings is the only way that other projects will be able to use our work, and we hope to be thorough in providing that documentation. Ultimately we would like to have contributed something that other projects will want to adopt.

Step Four
We hope that our project could serve as another place where categorization could be discussed, with GLAM projects specifically (but not exclusively) in mind.
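The category tree described in the project plan, and the use of document tags from Step One, can be sketched roughly as follows. This is a minimal illustration, not the Institute's script: the root category name, the "Example archive" naming scheme, the sample hierarchy, and the functions `category_pages` and `file_categories` are all hypothetical. The only wiki syntax relied on is the standard `[[Category:...]]` link, which places a page in a category.

```python
# Hypothetical root category and fonds/folder hierarchy (illustration only).
ROOT = "Example archive by fonds"

HIERARCHY = {
    "Fonds 001": {"Folder 1": {}, "Folder 2": {}},
    "Fonds 002": {"Folder 1": {}},
}


def category_pages(tree, parent=ROOT, path=()):
    """Map each category title to the wikitext that files it under its parent."""
    pages = {}
    for name, children in tree.items():
        new_path = path + (name,)
        title = "Example archive, " + ", ".join(new_path)
        pages[title] = f"[[Category:{parent}]]"
        # Recurse so that folders are placed under their fonds category.
        pages.update(category_pages(children, title, new_path))
    return pages


def file_categories(fonds, folder, tags=()):
    """Wikitext for one file: its folder category plus one category per tag."""
    cats = [f"Example archive, {fonds}, {folder}"]
    cats += [tag for tag in tags if tag not in cats]
    return "\n".join(f"[[Category:{c}]]" for c in cats)


pages = category_pages(HIERARCHY)
for title, wikitext in pages.items():
    print(f"Category:{title} -> {wikitext}")

print(file_categories("Fonds 001", "Folder 2", tags=["Letters", "1920"]))
```

Because a file can carry several category links at once, the same document can sit in its archival folder category and in any number of tag-based categories, which is the multi-category behaviour described earlier. Linking tag categories to existing Commons categories (Step Two) would still require checking that those categories exist.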
Activities
We have already developed code to transfer our archival categories to our Wikimedia Commons collection. As the code is developed, we also want to focus on creating sensible categorization practices. We will:
We assume the following improvements will be necessary as we work with the current code:
Community engagement
The Piłsudski Institute of America GLAM-Wiki project has a record of engaging volunteers. It has thus far resulted in the donation of about 1,200 documents from the Institute archives to Wikimedia Commons, along with over 50 new Wikipedia articles in both English and Polish. We are working on this project because we would like to add documents to Wikimedia Commons faster and more meaningfully. With a working script, the institutional volunteers we regularly attract to our project will be more efficient in sharing our collection.

After creating a resource page that gathers links to Wiki and non-Wiki resources documenting categorization practices, we hope to create a discussion space where categorization can be collaboratively brainstormed, and to align our project with practices developed there. We hope that such a discussion might attract collaborators and make our work useful for a broad audience and not just ourselves. Using our work, we hope that other Wiki projects will be able to streamline their own uploading processes and contextualize their collections in a more meaningful way, thus making their projects more attractive for Wikimedia users and their own project volunteers.

Potential Collaborations
Sustainability
The project will continue as we continue to add documents from our collections. Other GLAM projects will be able to use our work and hopefully make the code better as well.

Measures of success
Quantitative
Qualitative
Get involved
Participants
Project Manager
Lukasz Chelminski is a doctoral candidate in the History department at the CUNY Graduate Center and an adjunct instructor at Brooklyn College. He is a proponent of digital pedagogy and has used web resources in the classroom extensively. Lukasz is the Wikipedian-in-residence at the Piłsudski Institute of America. Through his experience at the Institute he hopes to develop a pedagogy strategy which will introduce his students to Wiki-editing through collaborative class projects.

Volunteer Coder

Volunteer Metadata Specialist
Marek Zielinski

Community Notification
Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.
Endorsements
Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project in the list below. (Other constructive feedback is welcome on the talk page of this proposal.)