As the Çatalhöyük Archive Report 2006 is only available online as a large PDF, I've copied the report below, but you can find additional reporting of my work in specialist reports like the Figurines report. I also contributed to the Çatalhöyük blog during the 2006 season.
Çatalhöyük blog posts, 2006:
- Catalhoyuk diaries: What I did on my summer holidays (September 2006)
- Catalhoyuk diaries: August 2006
- Catalhoyuk diaries: Settling in (July 2006)
IT Team / Veri Tabanı Ekibi – Mia Ridge & Sarah Jones
Team: Mia Ridge (Museum of London Systems Team), Richard May (Museum of London Systems Team), Sarah Jones.
The work of centralising the Çatalhöyük datasets into one integrated database continued over the winter months and throughout the 2006 season. The work consisted of gathering datasets, cleaning up data problems with the help of team members, and creating interfaces, or improving existing ones, to work with the centralised database. We used the project forum to call for feedback on team databases and to discuss any issues with downloading and using the forms.
The advantages of the integrated system have been highlighted by our ability to begin work on formalising terminologies to enable better cross-team communication and searching, on sharing data between previously separate applications, and on developing a web search facility available to the public.
Work on bringing the Çatalhöyük datasets together in a single database continued throughout the winter and the excavation season. The work consisted of gathering the datasets, resolving team members' problems with their data, and creating interfaces, or improving existing ones, so that they work within the central database. We used the project's discussion forum to gather the team's views on the databases and to discuss problems with downloading and using the forms.
The advantages of the integrated system have been brought out by our ability to begin work on clarifying terminologies to improve communication and searching between teams, on sharing data between previously separate applications, and on developing a website that is open to the public.
This year we are pleased to welcome Sarah Jones to the team. With generous sponsorship from Lynn Meskell, Columbia University, the project was able to advertise for a full-time 'Access Application Developer'. The focus of this role was to complete the basic development and integration of the Çatalhöyük Research Project's main database under the guidance of the MoLAS IT team. Sarah was the successful applicant and started in November. Her initial contract was for six months, but she was also able to go out on site to provide database support this year and continues to work with the project on a freelance basis.
The project invested in some new equipment at Çatalhöyük this year to improve the IT infrastructure. A number of new laptop computers were purchased and made available to team members to use to enter their data into the centralised database.
A number of improvements were also made to the network by purchasing new switches that increase network speed plus extra hubs to allow more users to connect to the network at once.
The number of power failures began to prove a problem for users connecting to the network via the wireless routers, so three UPSs (Uninterruptible Power Supplies) were bought to provide about 10 minutes of extra power, enough time for users to save their data and log off the network.
The project also purchased a new external hard drive to store the large image catalogue as well as to provide an additional back up facility. This drive is portable so can be carried between the UK and site.
Public Database Access Online
One of the greatest advantages of the centralisation of the Çatalhöyük datasets was reflected in our ability to create a web interface with on-line search facilities in a relatively short amount of time.
The web interface allows cross-platform access to the data, both to browse and to search. A first-generation system has been designed which supports everything from basic browsing to complex searching of the entire excavation and diary databases.
Work on the public database web access followed discussions held on site in the 2005 season. It was agreed that core data up to the present date would be published from all lab teams. The project fully appreciates the sensitivity of research data and we are working closely with each team to ensure only permissible data is released, so this work is ongoing. Each team is able to state the extent to which their data can be published by defining what is 'core' and what is 'specialist' for their data, as well as any restrictions by content area (to the table and field level) or by date. Core data is defined as un-interpreted, inventory-level, excavation and field data.
The lab team data is being released in staggered tranches depending on each team's needs for research and publication. For example, some teams may be happy to publish all their core data but wish to reserve specialist data until formal publication (for instance, publishing specialist data except for the last three years). The general aim is to publish specialist datasets on the public website around the time the same data is published in the volumes.
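The release rules described above (core versus specialist data, restrictions to the table and field level, and restrictions by date) can be sketched in a few lines of Python. This is only an illustration of the logic, not the project's actual implementation: the `ReleasePolicy` structure, the field names and the way the embargo is counted are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleasePolicy:
    """One team's publication rules (hypothetical structure)."""
    publish_core: bool = True
    publish_specialist: bool = False
    # Specialist records from the most recent N years are withheld.
    specialist_embargo_years: int = 0
    # (table, field) pairs the team has asked to keep off the public site.
    restricted_fields: frozenset = frozenset()


def is_releasable(level: str, recorded_year: int,
                  policy: ReleasePolicy, current_year: int) -> bool:
    """Decide whether a record may appear on the public website."""
    if level == "core":
        return policy.publish_core
    if level == "specialist":
        if not policy.publish_specialist:
            return False
        # With a three-year embargo in 2006, records from 2004-2006 are held back.
        return recorded_year <= current_year - policy.specialist_embargo_years
    raise ValueError(f"unknown data level: {level!r}")


def public_fields(table: str, fields: list, policy: ReleasePolicy) -> list:
    """Drop any fields the team has restricted for this table."""
    return [f for f in fields if (table, f) not in policy.restricted_fields]
```

A team that publishes specialist data except for the last three years would then be described as `ReleasePolicy(publish_specialist=True, specialist_embargo_years=3)`.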
The current on-line facility includes some trial hotspot plans and photos (e.g. Building 48 and Space 229) to more easily tally data with physical location, plus links to the image catalogue held in Portfolio.
The development process for the website involved techniques such as devising wireframes (plans of web pages and the flow of control between them) based on our experience with similar projects. These were sent to the project team for comment on how they think Çatalhöyük data can be usefully and engagingly presented. The feedback received from the team on the way the data is made accessible has been invaluable and this process is ongoing.
We surveyed team members about the software they would be able to use on- and off-site. The interfaces to the Çatalhöyük site databases are written in Microsoft Access version 2000, which is available as part of Office 2000 Professional. We need to continue developing the systems in this version as some users still only have this, and the site computers run it. The site databases will run in later versions of Access but we have not fully tested them there.
With Amy Bogaard's help we have gathered pre-2003 data from the varied files maintained by the previous team and added these to the data already centralised. We also worked on importing the data from the years 2003 – 2005 into the new structure created last season and developed a new data entry interface with exporting and reporting capabilities. This work continued on site where the new system was used for the first time and where the flotation log was directly available to other teams in their systems.
Beads and material artefacts
This database is an exciting development as it draws data from different materials-based databases into an interface that links their formal characteristics and adds layers of artefact-based recording.
It will record stone, clay and bone beads initially. The interface will draw data from the Faunal and Excavation databases, as well as the new Clay and upcoming Architectural databases, and future Glass and Shell databases. It will also allow specialists to record attributes specific to beads and related artefacts. The recording structure will include material and material sub-type; object class, object type and object sub-type. The certainty of material sub-type identification can be recorded, as well as the method of analysis that led to the identification. Like the Clay database, this includes visual methods and sampling/analysis.
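As a rough sketch of the recording structure just described, the record below carries material and material sub-type, object class, type and sub-type, plus the certainty and method of the sub-type identification. The class and field names, the `find_id` key and the three-step certainty scale are assumptions made for illustration; only the two identification methods come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IdentificationMethod(Enum):
    VISUAL = "visual"
    SAMPLING_ANALYSIS = "sampling/analysis"


class Certainty(Enum):
    # Hypothetical scale; the report does not list the actual values.
    CERTAIN = "certain"
    PROBABLE = "probable"
    POSSIBLE = "possible"


@dataclass
class BeadRecord:
    find_id: str                      # hypothetical key linking to other databases
    material: str                     # e.g. stone, clay, bone
    object_class: str
    object_type: str
    material_subtype: Optional[str] = None
    object_subtype: Optional[str] = None
    subtype_certainty: Optional[Certainty] = None
    identification_method: Optional[IdentificationMethod] = None

    def __post_init__(self) -> None:
        # Certainty and method describe a sub-type identification,
        # so they are rejected when no sub-type has been recorded.
        has_detail = (self.subtype_certainty is not None
                      or self.identification_method is not None)
        if self.material_subtype is None and has_detail:
            raise ValueError("certainty/method recorded without a material sub-type")
```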
The latest dataset is now stored by the project and will be centralised in the future.
The pre-1999 dataset was centralised in 2005. Since 1999 the chipped stone team have worked on devising a new recording system to suit their work and the material now appearing on site. This data structure, held in Excel, was translated into a database structure, a process which highlighted areas where the data needed to be cleaned. A new data entry interface was created and used on site this season. The chipped stone team were able to take full advantage of the new integrated database by drawing data directly from the excavation database and flotation log.
Specialisms covered by this database include stamp seals, figurines, clay balls, ceramics, other shaped objects and building materials.
The available datasets were centralised in their existing structures over the winter. The clay objects specialists have been involved in on-going discussions with Mia about a shared recording model.
Overall, the goals of this project are to:
- implement shared value lists and recording codes
- implement the core/specialist data model developed in previous years
- enable comparison of artefacts across specialisms
- create an extensible system that allows for new ways of understanding clay objects at Çatalhöyük.
Supporting tables will be created for previous data sets of bulk data so that a unified view of all clay objects from the site can be created regardless of when and how it was originally recorded.
Where recording structures have changed significantly and previous data can't be mapped to a similar level of specificity, original fabric descriptors (colour, etc) will be kept and can be displayed on specialist forms so that information can be re-created from fabric types and descriptions. Munsell colours can be mapped to field samples so that previous data can be integrated into the updated structure.
Issues to be resolved included the primary keys to be used to link tables across shared recording structures and into specialist recording; the structure of the basic Clay Unit Description table; the basic artefact Finds data; sample and bulk recording; and agreement on basic Materials, Material sub-types, Artefact types and Artefact sub-types. Understanding the data structures required is not just a technical process, and the investigation included an analysis of the semantic meaning embedded in existing recording structures.
A model of recording was emerging where divisions fall naturally between fabric (the original matrix, before human modification but including things like naturally occurring inclusions), manufacturing (including surface treatments), and post-manufacturing (for example, use and environmental wear and post-depositional events). However, the model needs more consideration to ensure the reliability of observations and with regard to practicalities such as the amount of time spent recording each artefact.
This season we were lucky enough to work with Chris Doherty, a geoarchaeologist from the Research Laboratory for Archaeology and the History of Art, Oxford. He was heavily involved in discussions about the best way to record the technical characteristics of clay artefacts. Working with Chris has been an important factor in our attempt to move from descriptive to diagnostic recording.
Structurally, characteristics that occur across the lifetime of an artefact, such as variations in colour and exposure to heat/fire, have been grouped between original fabric, manufacturing and post-manufacturing, but they can be gathered together on one form or interface for ease of recording if required by specialists. The structure also allows a probability to be recorded, reflecting the uncertainties inherent in the material.
Material aspects include fabric and inclusions, including original matrix and tempering. Manufacture elements include qualitative characteristics, surface treatment, and fire exposure. Use and post-depositional changes include use and environmental wear, condition, fragmentation, and possible intentional damage.
The structure also includes artefact-specific tables to allow specialists to record data to their exact requirements. Representational object recording can also be used to link clay objects from across the site like figurines and stamp seals with other artefacts such as wall paintings through characteristics such as pose and representational form.
This season we spent a lot of time on changes requested to the Ceramics database, as recording moved from individual sherds to finds grouped by ware, using a Çatalhöyük-specific typology developed by the Ceramics team. We also participated in discussions about the integration of previous West Mound pottery data with existing databases.
The conservation dataset was centralised in 2005 and its existing interface linked to the integrated database. On site this season a major re-design was undertaken of the underlying data structure to facilitate its ability to link to other databases, primarily the excavation and finds data. The interface was adapted to reflect these changes and a number of new features, including reports, were introduced. The conservation database was also used as the example to show how photographs can be directly linked into a lab system.
This dataset was centralised in previous years. On-site this season the existing interface required some minor fixes.
The excavation data was centralised in previous years and its existing interface linked to the integrated database. Off-site work focused on highlighting areas for data cleaning and improving the data entry interface to ensure the integrity of the data is maintained in future.
In addition, a number of data structure alterations were undertaken to take advantage of the relational power of SQL Server.
Further cleaning is required to resolve discrepancies such as sample types. This may also require the introduction of sample sub-types or materials.
Mia worked with Louise Martin and Lisa Yeomans in London to centralise faunal data and test the centralised interfaces, continuing the work of the past seasons on data integrity and forms. The implementation of relational integrity highlighted data cleaning issues and a report was sent to Louise and Nerissa Russell outlining the rows that needed review.
Data cleaning and centralisation continued on site this season. The existing interface was adapted to ensure data integrity is maintained in future by placing greater control on movement between screens. As the Artefacts interface was used intensively for the first time, some issues were discovered and resolved on site.
Some modifications will be required to support the display of object locations from the Finds database.
Work began on re-designing the Finds recording system in 2005, the existing data having already been centralised. Over the winter the data and structure were analysed and requirements for improving the system were gathered. This process continued on site with Julie Cassidy’s valuable input on how best to formalise terminologies of material and object types to enable accurate searching of the data and to allow easier linking with other datasets.
A major re-design was undertaken of the data structure that records finds at the point they are brought into the finds room (previously known as the x-finds sheet), and this change required a lot of data cleaning to formalise terminologies. Work on improving the crates register was also begun and is ongoing.
Karen Wright had worked hard during the year to devise a data structure in Excel and this was translated into a database structure so the existing data could be imported into the centralised system. A new interface was devised and work is ongoing to improve this.
The heavy residue dataset was centralised in 2005 and on site this season a few minor interface changes were implemented and areas for data cleaning identified. This system has benefited from the centralisation with its ability to now link directly to the flotation log and excavation database.
We also undertook some cleaning during the season and added functionality for conditional data entry, such as making fields available as appropriate for a given material.
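The idea of conditional data entry, where fields become available only when they are appropriate for a given material, can be illustrated with a small lookup. The materials, field names and helper functions below are invented for the example and do not reflect the actual forms, which were built in Microsoft Access.

```python
# Fields shown for every record (hypothetical names).
COMMON_FIELDS = frozenset({"find_id", "material", "weight_g"})

# Extra fields enabled per material (hypothetical names).
FIELDS_BY_MATERIAL = {
    "clay": frozenset({"fabric", "surface_treatment"}),
    "stone": frozenset({"raw_material", "polish"}),
}


def enabled_fields(material: str) -> frozenset:
    """Return the fields a data entry form would make available."""
    return COMMON_FIELDS | FIELDS_BY_MATERIAL.get(material, frozenset())


def rejected_fields(record: dict) -> list:
    """List filled-in fields that are not appropriate for the record's material."""
    allowed = enabled_fields(record.get("material", ""))
    return sorted(
        name for name, value in record.items()
        if value not in (None, "") and name not in allowed
    )
```

Keeping the rules in a lookup rather than in form code means the same check can be reused when cleaning data that was entered before the rule existed.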
Requirements analysis began in previous seasons. Mia sent over data structures that would form the basis of the application. The team then created an interface to allow them to start recording their data. This fantastic work highlighted how it is within each team's capability to create working systems within a very short space of time using Microsoft Access as a development tool.
We have worked closely with Jason Quinlan to implement processes that make metadata about images catalogued in the image management application Portfolio available via the centralised database. This idea was prototyped in the excavation and conservation databases, where functionality was added so that images are directly linked to, and viewable from, their related records. A mechanism for exchanging metadata between Portfolio and SQL Server has been set up to keep both systems up to date with changes made in either.
The available microfauna data was centralised over the winter and the existing interface improved and new interface requirements implemented. The centralised system was used for the first time this season. There is on-going work on centralising the Bach microfauna data to complete the existing dataset.
An assessment was made of the work to be carried out on the existing database and this is on-going.
After an assessment of the existing recording structures in Excel and current requirements, a database structure was designed and the data centralised. A new interface was developed and this work continued on site.
Other Database Work
A full security model for the centralised database was planned and implemented over the winter. The permissions model allowed each team to enter and modify their own data, and read but not modify data from other teams. This ensured that the entire team could benefit from the data centralisation while guaranteeing the integrity of their data.
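The rule at the heart of this permissions model (every team can read everything, but only the owning team can enter or modify data) is simple enough to state as code. The sketch below only illustrates the logic; in a database server such rules are normally expressed as permissions granted to roles, and the table and team names here are hypothetical.

```python
# Which team owns which table (hypothetical names).
TABLE_OWNERS = {
    "faunal_bone": "faunal",
    "conservation_record": "conservation",
}

WRITE_ACTIONS = frozenset({"insert", "update"})


def is_allowed(team: str, action: str, table: str) -> bool:
    """Apply the read-all, write-own rule to a single request."""
    if table not in TABLE_OWNERS:
        raise KeyError(f"unknown table: {table!r}")
    if action == "select":
        return True                          # every team may read all data
    if action in WRITE_ACTIONS:
        return TABLE_OWNERS[table] == team   # only the owning team may change it
    raise ValueError(f"unknown action: {action!r}")
```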
In season 2005 the AllTables concept was developed to allow team members and researchers access to all permissible data (read-only) by providing them with a file where they could store their own queries, forms and reports. This idea was developed further over the winter to ensure any database changes were automatically reflected in the AllTables file. This new version was used on site this season by a number of researchers.
Training materials and documentation written in season 2005 were extended in response to questions raised by the team. General IT documentation for on-site team members was updated. Documentation work is ongoing as new databases are developed, functionality is added to existing databases and the training needs of the team become apparent.
The lack of documentation for the functionality of existing databases had hindered our ability to develop cross-platform solutions and to move validation and integrity to the back-end. We hope to build a body of documentation for previously existing applications that will enable these modifications in future.
All the work done on the data centralisation would not have been possible without the fantastic co-operation of all the teams and we are exceptionally grateful for their time, patience and expertise.
The advantages of the centralisation process have again been highlighted this season by the ability of different teams to share data, for example excavation data and the flotation log, which has reduced data duplication and related errors.
The benefits of a relational database became apparent during the process of cleaning data previously held in non-relational applications. Duplication, invalid codes and data integrity problems that were previously undetected were discovered and resolved.
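The kinds of problem mentioned here (duplicates, invalid codes and broken links between tables) are typically found with a handful of queries run before relational constraints are switched on. The sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the table names, column names and sample rows are invented for illustration rather than taken from the project's schema.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create small tables with no constraints, as imported legacy data might be."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE excavation_unit (unit_number INTEGER);
        CREATE TABLE sample_type (code TEXT);
        CREATE TABLE sample (sample_id INTEGER, unit_number INTEGER, type_code TEXT);
        INSERT INTO excavation_unit VALUES (1001), (1002), (1002);
        INSERT INTO sample_type VALUES ('FLOT'), ('ARCH');
        INSERT INTO sample VALUES (1, 1001, 'FLOT'), (2, 1003, 'FLOT'), (3, 1002, 'XX');
    """)
    return conn


def find_integrity_problems(conn: sqlite3.Connection) -> dict:
    """Report rows that would violate keys or lookups once constraints are added."""
    duplicates = conn.execute("""
        SELECT unit_number FROM excavation_unit
        GROUP BY unit_number HAVING COUNT(*) > 1
        ORDER BY unit_number""").fetchall()
    orphans = conn.execute("""
        SELECT s.sample_id FROM sample AS s
        LEFT JOIN excavation_unit AS u ON u.unit_number = s.unit_number
        WHERE u.unit_number IS NULL
        ORDER BY s.sample_id""").fetchall()
    invalid_codes = conn.execute("""
        SELECT s.sample_id, s.type_code FROM sample AS s
        LEFT JOIN sample_type AS t ON t.code = s.type_code
        WHERE t.code IS NULL
        ORDER BY s.sample_id""").fetchall()
    return {
        "duplicate_units": [row[0] for row in duplicates],
        "orphan_samples": [row[0] for row in orphans],
        "invalid_codes": [tuple(row) for row in invalid_codes],
    }
```

Each list can then be sent to the relevant team as the rows needing review before keys and lookups are enforced.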
There is still much work to do. Some datasets have yet to be integrated, and the new interfaces, which are in their infancy, can be developed further and evolve as recording needs change. The data published on the public website can also be extended to make more of the centralised data available, and discussions are ongoing with the lab teams to release permissible records. This season has seen the benefit of the previous years' work in the way data is stored and used across teams.