Archive report: Çatalhöyük Archive Report 2006

As the Çatalhöyük Archive Report 2006 is only available online as a large PDF, I’ve copied the report below, but you can find additional reporting of my work in specialist reports like the Figurines report. I also contributed to the Çatalhöyük blog during the 2006 season.

IT Team / Veri Tabanı Ekibi – Mia Ridge & Sarah Jones
Team: Mia Ridge (Museum of London Systems Team), Richard May (Museum of London Systems Team), Sarah Jones.

Abstract

The work of centralising the Çatalhöyük datasets into one integrated
database continued over the winter months and throughout the 2006
season. The work comprised gathering datasets, cleaning up data
problems with the help of team members and creating interfaces or
improving existing ones to work with the centralised database. We
used the project forum to call for feedback on team databases, and
to discuss any issues with downloading and using the forms.

The advantages of the integrated system have been highlighted by our
ability to begin formalising terminologies to enable better
cross-team communication and searching, to share data between
previously separate applications and to develop a web search
facility available to the public.

Özet

Çatalhöyük veri takımlarını tek bir veri tabanı altında toplamak için
yaptığımız çalışmalar kış ve kazı sezonu boyunca devam etti.
Çalışmalar veri takımlarını biraraya getirmek, ekip üyelerinin verilerle
ilgili problemlerini çözmek ve verileri birbirine bağlayarak ya da daha
önceden bağlanmış verileri geliştirerek merkezi veri tabanı içinde
çalışmalarını sağlamak gibi işlemlerden oluşmaktadır. Ekibin veri
tabanları hakkında fikir edinmesi için projenin tartışma forumunu
kullanarak, formların indirilip kullanılabilmesiyle ilgili problemleri
tartıştık.

Biraraya getirilmiş sistemin avantajları imkanlarımız doğrultusunda ön
plana çıkarılmaya çalışılarak, ekipler arasında daha iyi bir
komunikasyon ve araştırma sağlamak için terminolojilerin
belirginleştirilmesi, önceden ayrılmış olan uygulamalar arasındaki veri
paylaşımının oluşturulması ve halka açık olan bir internet sitesinin
geliştirilmesi için çalışmalara başlamak gerekmektedir.

Team news

This year we are pleased to welcome Sarah Jones to the team.
With generous sponsorship from Lynn Meskell, Columbia University, the project
was able to advertise for a full-time ‘Access Application Developer’. The focus of this
role was to complete the basic development and integration of the Çatalhöyük
Research Project’s main database under the guidance of the MoLAS IT team. Sarah
was the successful applicant and started in November. Her initial contract was for six
months but she was also able to go out on site to provide database support this year
and continues to work with the project on a freelance basis.

IT Infrastructure

The project invested in some new equipment at Çatalhöyük this year to improve the
IT infrastructure. A number of new laptop computers were purchased and made
available to team members to use to enter their data into the centralised database.

A number of improvements were also made to the network by purchasing new
switches that increase network speed plus extra hubs to allow more users to connect to
the network at once.

The number of power failures began to prove a problem for those users connecting to
the network via the wireless routers, so three UPSs (Uninterruptible Power Supplies)
were bought to provide about 10 minutes of extra power, enough time for users to
save their data and log off the network.

The project also purchased a new external hard drive to store the large image
catalogue as well as to provide an additional backup facility. This drive is portable so
it can be carried between the UK and site.

Public Database Access Online

One of the greatest advantages of the centralisation of the Çatalhöyük datasets was
reflected in our ability to create a web interface with on-line search facilities in a
relatively short amount of time.

The web interface allows cross-platform access to the data, both to browse and
to search. A first-generation system has been designed which supports everything
from basic browsing to complex searching of the entire excavation and diary databases.

Work on the public database web access followed discussions held on site in the 2005
season. It was agreed that core data up to the present date would be published from all
lab teams. The project fully appreciates the sensitivity of research data and we are
working closely with each team to ensure only permissible data is released, so this
work is ongoing. Each team is able to state the extent to which their data can be
published by defining what is ‘core’ and what is ‘specialist’ for their data, as well as
any restrictions by content area (to the table and field level) or by date. Core data is
defined as un-interpreted, inventory level, excavation and field data.

The lab team data is being released in staggered tranches depending on each team’s
needs for research and publication. Some teams may be happy to publish all their
core data but wish to reserve specialist data until formal publication (for example,
publishing specialist data except for the last three years). The general aim is to
publish specialist data sets on the public website around the time the same data is
published in the volumes.
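
The release rules described above can be illustrated with a small sketch. It is
not the project's implementation: the names ReleasePolicy, Record and
is_publishable are hypothetical, the cut-off year is chosen only for the example,
and table- and field-level restrictions are not modelled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ReleasePolicy:
    """One team's publication rules (hypothetical structure)."""
    publish_core: bool = True
    publish_specialist: bool = False
    # If set, specialist records from this year onward are held back.
    specialist_embargo_from_year: Optional[int] = None


@dataclass(frozen=True)
class Record:
    team: str
    level: str  # "core" or "specialist"
    year: int   # season in which the record was made


def is_publishable(record: Record, policy: ReleasePolicy) -> bool:
    """Return True if the record may appear on the public website."""
    if record.level == "core":
        return policy.publish_core
    if not policy.publish_specialist:
        return False
    cutoff = policy.specialist_embargo_from_year
    return cutoff is None or record.year < cutoff


# Assumed example: specialist data is published except for the last
# three seasons (2004 to 2006).
policy = ReleasePolicy(publish_specialist=True, specialist_embargo_from_year=2004)
records = [
    Record("faunal", "core", 2006),
    Record("faunal", "specialist", 2002),
    Record("faunal", "specialist", 2005),
]
public = [r for r in records if is_publishable(r, policy)]
```

With this assumed policy the 2006 core record and the 2002 specialist record are
released, while the 2005 specialist record is held back until the cut-off moves.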

The current on-line facility includes some trial hotspot plans and photos (e.g. Building
48 and Space 229) to more easily tally data with physical location, plus links to the
image catalogue held in Portfolio.

The development process for the website involved techniques such as devising
wireframes (plans of web pages and the flow of control between them) based on our
experience with similar projects. These were sent to the project team for comment on
how they think Çatalhöyük data can be usefully and engagingly presented. The
feedback received from the team on the way the data is made accessible has been
invaluable and this process is ongoing.

Databases

General

We surveyed team members about the software they would be able to use on- and off-
site. The interfaces to the Çatalhöyük site databases are written in Microsoft Access
version 2000, which is available as part of Office 2000 Professional. It is necessary
that we continue to develop the systems in this version as some users still only have
this, and the site computers run it. The site databases will run in later versions of
Access but we have not fully tested them.

Archaeobots

With Amy Bogaard’s help we have gathered pre-2003 data from the varied files
maintained by the previous team and added these to the data already centralised. We
also worked on importing the data from the years 2003 – 2005 into the new structure
created last season and developed a new data entry interface with exporting and
reporting capabilities. This work continued on site where the new system was used for
the first time and where the flotation log was directly available to other teams in their
systems.

Beads and material artefacts

This database is an exciting development as it draws data from different materials-
based databases, into an interface that links their formal characteristics and adds
layers of artefact-based recording.

It will record stone, clay and bone beads initially. The interface will draw data from
the Faunal and Excavation databases, as well as the new Clay and upcoming
Architectural databases, and future Glass and Shell databases. It will also allow
specialists to record attributes specific to beads and related artefacts. The recording
structure will include material and material sub-type; object class, object type and
object sub-type. The certainty of material sub-type identification can be recorded, as
well as the method of analysis that led to the identification. Like the Clay database,
this includes visual methods and sampling/analysis.
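
As an illustration only, the recording structure described above could be
sketched as a single record type. The field names follow the report's wording,
but the class names, the find_id field, the certainty levels and the example
values are assumptions rather than the project's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Certainty(Enum):
    # Assumed levels; the report only says that certainty can be recorded.
    CERTAIN = "certain"
    PROBABLE = "probable"
    POSSIBLE = "possible"


class IdentificationMethod(Enum):
    VISUAL = "visual"
    SAMPLING_ANALYSIS = "sampling/analysis"


@dataclass
class BeadRecord:
    find_id: str  # hypothetical identifier linking back to the Finds data
    material: str
    material_sub_type: Optional[str]
    object_class: str
    object_type: str
    object_sub_type: Optional[str] = None
    sub_type_certainty: Optional[Certainty] = None
    identification_method: Optional[IdentificationMethod] = None

    def __post_init__(self) -> None:
        # A certainty only makes sense once a material sub-type is recorded.
        if self.material_sub_type is None and self.sub_type_certainty is not None:
            raise ValueError("certainty recorded without a material sub-type")


# Invented example values, for illustration only.
bead = BeadRecord(
    find_id="example-1",
    material="stone",
    material_sub_type="limestone",
    object_class="ornament",
    object_type="bead",
    sub_type_certainty=Certainty.PROBABLE,
    identification_method=IdentificationMethod.VISUAL,
)
```

Keeping the certainty and the method of analysis beside the sub-type means a
later reader can tell a visual guess from a laboratory identification.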

Charcoal Analysis

The latest dataset is now stored by the project and will be centralised in the future.

Chipped Stone

The pre-1999 dataset was centralised in 2005. Since 1999 the chipped stone team
have worked on devising a new recording system to suit their work and the material
now appearing on site. This data structure, held in Excel, was translated into a
database structure, which highlighted areas where the data needed to be cleaned. A
new data entry interface was created which was used on site this season. The chipped
stone team were able to take full advantage of the new integrated database by drawing
data directly from the excavation database and flotation log.

Clay objects

Specialisms covered by this database include stamp seals, figurines, clay balls,
ceramics, other shaped objects and building materials.

The available datasets were centralised in their existing structures over the winter.
The clay objects specialists have been involved in on-going discussions with Mia
about a shared recording model.

Overall, the goals of this project are to:
· implement shared value lists and recording codes
· implement the core/specialist data model developed in previous years
· enable comparison of artefacts across specialisms
· create an extensible system that allows for new ways of understanding clay
objects at Çatalhöyük.

Supporting tables will be created for previous data sets of bulk data so that a unified
view of all clay objects from the site can be created regardless of when and how it
was originally recorded.

Where recording structures have changed significantly and previous data can’t be
mapped to a similar level of specificity, original fabric descriptors (colour, etc) will be
kept and can be displayed on specialist forms so that information can be re-created
from fabric types and descriptions. Munsell colours can be mapped to field samples so
that previous data can be integrated into the updated structure.

Issues to be resolved included the primary keys to be used to link tables across shared
recording structures and into specialist recording, the structure of the basic Clay Unit
Description table, the basic artefact Finds data, as well as sample and bulk recording
and agreement on basic Materials, Material sub-types, Artefact types and artefact sub-
types. Understanding the data structures required is not just a technical process and
investigation included an analysis of the semantic meaning embedded in existing
recording structures.

A model of recording was emerging where divisions fall naturally between fabric (the
original matrix, before human modification but including things like naturally
occurring inclusions), manufacturing (including surface treatments), and post-
manufacturing (for example, use and environmental wear and post-depositional
events). However, the model needs more consideration to ensure the reliability of
observations and with regard to practicalities such as the amount of time spent
recording each artefact.

This season we were lucky enough to work with Chris Doherty, a geoarchaeologist
from the Research Laboratory for Archaeology and the History of Art, Oxford. He
was heavily involved in discussions about the best way to record the technical
characteristics of clay artefacts. Working with Chris has been an important factor in
our attempt to move from descriptive to diagnostic recording.

Structurally, characteristics that occur across the lifetime of an artefact, such as
variations in colour and exposure to heat/fire, have been divided between original
fabric, manufacturing and post-manufacturing, but they can be gathered together on
the one form or interface for ease of recording if required by specialists. The structure
also allows the probability to be recorded to allow for the uncertainties inherent in the
material.

Material aspects include fabric and inclusions, including original matrix and
tempering. Manufacture elements include qualitative characteristics, surface
treatment, and fire exposure. Use and post-depositional changes include use and
environmental wear, condition, fragmentation, and possible intentional damage.

The structure also includes artefact-specific tables to allow specialists to record data
to their exact requirements. Representational object recording can also be used to link
clay objects from across the site like figurines and stamp seals with other artefacts
such as wall paintings through characteristics such as pose and representational form.

This season we spent a lot of time on changes requested to the Ceramics database
as the team moved from recording individual sherds to
recording finds grouped by ware with a Çatalhöyük-specific typology developed by
the Ceramics team. We also participated in discussions about the integration of
previous West Mound pottery data with existing databases.

Conservation

The conservation dataset was centralised in 2005 and its existing interface linked to
the integrated database. On site this season a major re-design was undertaken of the
underlying data structure to facilitate its ability to link to other databases, primarily
the excavation and finds data. The interface was adapted to reflect these changes and
a number of new features, including reports, were introduced. The conservation
database was also used as the example to show how photographs can be directly
linked into a lab system.

Diary

This dataset was centralised in previous years. On-site this season the existing
interface required some minor fixes.

Excavation

The excavation data was centralised in previous years and its existing interface linked
to the integrated database. Off-site work focused on highlighting areas of data
cleaning and improving the data entry interface to ensure the integrity of the data was
maintained in future.

In addition to this a number of data structure alterations were undertaken to take
advantage of the relational power of SQL Server.

Further cleaning is required to resolve discrepancies such as sample types. This may
also require the introduction of sample sub-types or materials.

Faunal

Mia worked with Louise Martin and Lisa Yeomans in London to centralise faunal
data and test the centralised interfaces, continuing the work of the past seasons on
data integrity and forms. The implementation of relational integrity highlighted data
cleaning issues and a report was sent to Louise and Nerissa Russell outlining the rows
that needed review.
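
The kind of report described above can be sketched as a query that lists rows
pointing at records that do not exist. This is an assumption-laden illustration,
not the project's code: it uses Python's built-in sqlite3 module as a stand-in
for SQL Server, and the table names, column names and sample values are invented.

```python
import sqlite3

# In-memory SQLite database standing in for the centralised SQL Server database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE excavation_unit (unit_number INTEGER PRIMARY KEY);
    CREATE TABLE faunal_record (
        record_id   INTEGER PRIMARY KEY,
        unit_number INTEGER,
        taxon_code  TEXT
    );
    INSERT INTO excavation_unit VALUES (1001), (1002);
    INSERT INTO faunal_record VALUES
        (1, 1001, 'OVIS'),
        (2, 1002, 'BOS'),
        (3, 9999, 'OVIS');  -- refers to a unit that does not exist
""")

# Rows that would violate a foreign key from faunal_record to excavation_unit.
orphans = conn.execute("""
    SELECT f.record_id, f.unit_number
    FROM faunal_record AS f
    LEFT JOIN excavation_unit AS u ON u.unit_number = f.unit_number
    WHERE u.unit_number IS NULL
    ORDER BY f.record_id
""").fetchall()

for record_id, unit_number in orphans:
    print(f"record {record_id}: unknown unit {unit_number}")
```

Running a check like this before the constraint is enforced gives specialists a
list of rows to review rather than a failed import.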

Data cleaning and centralisation continued on site this season. The existing interface
was adapted to ensure data integrity was maintained in future by placing greater
control on movement between screens. As the Artefacts interface was used intensively
for the first time, some issues were discovered and resolved on site.

Some modifications will be required to support the display of object locations from
the Finds database.

Finds

Work began on re-designing the Finds recording system in 2005, the existing data
having already been centralised. Over the winter the data and structure were
analysed and requirements for improving the system were gathered. This
process continued on site with Julie Cassidy’s valuable input on how best to formalise
terminologies of material and object types to enable accurate searching of the data and
to allow easier linking with other datasets.

A major re-design was undertaken of the data structure that records finds at the point
they are brought into the finds room (previously known as the x-finds sheet), and this
change required a lot of data cleaning to formalise terminologies. Work on improving
the crates register was also begun and this work is ongoing.

Groundstone

Karen Wright had worked hard during the year to devise a data structure in Excel and
this was translated into a database structure so the existing data could be imported into
the centralised system. A new interface was devised and work is ongoing to improve
this.

Heavy Residue

The heavy residue dataset was centralised in 2005 and on site this season a few minor
interface changes were implemented and areas for data cleaning identified. This
system has benefited from the centralisation with its ability to now link directly to the
flotation log and excavation database.

We also undertook some cleaning during the season and added functionality for
conditional data entry, such as making fields available as appropriate for a given
material.
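
Conditional data entry of this kind can be pictured as a lookup from material to
the fields that apply to it. The sketch below is illustrative only: the site
interfaces are written in Microsoft Access, and the material names, field names
and function names here are hypothetical.

```python
# Hypothetical mapping from material to the extra fields that apply to it.
FIELDS_BY_MATERIAL = {
    "bone": {"taxon", "burnt"},
    "pottery": {"ware", "sherd_count"},
    "obsidian": {"weight_g"},
}

# Fields that are always available, whatever the material.
COMMON_FIELDS = {"unit_number", "material", "count"}


def enabled_fields(material: str) -> set[str]:
    """Fields the form should make available for the given material."""
    return COMMON_FIELDS | FIELDS_BY_MATERIAL.get(material, set())


def validate_entry(entry: dict) -> list[str]:
    """Return the names of filled-in fields that do not apply to the material."""
    allowed = enabled_fields(entry.get("material", ""))
    return sorted(field for field in entry if field not in allowed)


problems = validate_entry({"unit_number": 1001, "material": "bone", "ware": "dark"})
```

In this example problems contains only "ware", because that field is not enabled
for bone in the assumed mapping.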

Human Remains

Requirements analysis began in previous seasons. Mia sent over data structures that
would form the basis of the application. The team then created an interface to allow
them to start recording their data. This fantastic work highlighted how it is within
each team’s capability to create working systems within a very short space of time
using Microsoft Access as a development tool.

Images

We have worked closely with Jason Quinlan to implement processes to make
metadata about images that are catalogued in the image management application
Portfolio available via the centralised database. This idea was prototyped in the
excavation and conservation databases where functionality was added so images are
directly linked and viewable from their related records. A mechanism for exchanging
metadata between Portfolio and SQL Server has been set up to keep both systems up
to date with changes in each system.

Microfauna

The available microfauna data was centralised over the winter and the existing
interface improved and new interface requirements implemented. The centralised
system was used for the first time this season. There is on-going work on centralising
the Bach microfauna data to complete the existing dataset.

Micromorphology

An assessment was made of the work to be carried out on the existing database and
this is on-going.

Phytoliths

After an assessment of the existing recording structures in Excel and current
requirements, a database structure was designed and the data centralised. A new
interface was developed and this work continued on site.

Other Database Work

Security

A full security model for the centralised database was planned and implemented over
the winter. The permissions model allowed each team to enter and modify their own
data, and read but not modify data from other teams. This ensured that the entire team
could benefit from the data centralisation while guaranteeing the integrity of their
data.
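
The rule behind this model (write access to your own team's tables, read-only
access to everyone else's) can be sketched as follows. This is a simplified
illustration and not the SQL Server configuration used by the project; the table
names, team names and function names are assumptions.

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()


# Hypothetical ownership map: table name -> owning team.
TABLE_OWNER = {
    "faunal_record": "faunal",
    "conservation_record": "conservation",
}


def permissions_for(team: str, table: str) -> Permission:
    """Owners may read and write; every other team may only read."""
    owner = TABLE_OWNER.get(table)
    if owner is None:
        return Permission.NONE  # unknown tables are not exposed at all
    if owner == team:
        return Permission.READ | Permission.WRITE
    return Permission.READ


def can_write(team: str, table: str) -> bool:
    return Permission.WRITE in permissions_for(team, table)
```

For example, can_write("faunal", "faunal_record") is true, while
can_write("faunal", "conservation_record") is false even though the faunal team
can still read the conservation data.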

Analysis support

In season 2005 the AllTables concept was developed to allow team members and
researchers access to all permissible data (read-only) by providing them with a file
where they could store their own queries, forms and reports. This idea was developed
further over the winter to ensure any database changes were automatically reflected in
the AllTables file. This new version was used on site this season by a number of
researchers.

Documentation

Training materials and documentation written in season 2005 were extended in
response to questions raised by the team. General IT documentation for on-site team
members was updated. Documentation work is ongoing as new databases are
developed, functionality is added to existing databases and the training needs of the
team become apparent.

The lack of documentation for functionality of existing databases had hindered our
ability to develop cross-platform solutions and to move validation and integrity to the
back-end. We hope to build a body of documentation for previously
existing applications that will enable these modifications in future.

Conclusion

All the work done on the data centralisation would not have been possible without the
fantastic co-operation of all the teams and we are exceptionally grateful for their time,
patience and expertise.

The advantages of the centralisation process have again been highlighted this season
by the ability of different teams to share data, for example excavation data and the
flotation log, which has reduced data duplication and related errors.

The benefits of a relational database became apparent during the process of cleaning
data previously held in non-relational applications. Duplication, invalid codes and
data integrity problems that were previously undetected were discovered and
resolved.

There is still much work to do. Some datasets have yet to be integrated, and the new
interfaces, which are in their infancy, can be further developed and evolve
as recording needs change. The data published on the public website can also be
extended to make more of the centralised data available and discussions are ongoing
with the lab teams to release permissible records. This season has seen the benefit of
previous years’ work in the way data is stored and used across teams.
