# Welcome to CIIM
Knowledge base for all things CIIM related
If you need help or have any questions, feel free to contact us at support@k-int.com.
# What is CIIM?
The CIIM framework is a suite of open-source middleware and API components, originally created in 2009 for the London Museum (formerly Museum of London). It has evolved continuously since then, now provides the reference middleware implementation for the cultural sector, and is in use at a wide range of institutions worldwide.
This highly scalable, fully automated ingest framework comfortably handles over 80 million rich records and seamlessly integrates with 30+ collection, archive, library and digital asset management systems.
CIIM features a schemaless data model and is ingest-format agnostic, with a flexible processing workflow and field-level data enrichment.
AI integrations, deep zoom/IIIF media and manifest handling, automated publication, media artifact creation/watermarking, and optimized search and retrieval across large datasets are only some of the areas in which CIIM excels. A comprehensive data auditing strategy is also built in, ensuring data integrity across all aspects of ingest, processing and data/media publication.
# Who is CIIM for?
Although CIIM was originally developed for the cultural heritage sector (e.g. galleries, libraries, archives, museums and cultural aggregators), it is applicable in any sector that needs to make sense of large amounts of disparate data.
# Why choose CIIM?
Data integration can appear to be a ‘simple’ problem to solve, yet CIIM answers the data integration, audit and sharing questions that most other solutions and frameworks haven’t even thought of asking yet.
First and foremost CIIM is not a bespoke solution, but a versatile framework using plugins and extension points to make customisation easy.
It is built on over 15 years of data analysis, use cases, and differing institutional requirements and data models. As an open-source product, it continuously evolves based on both user requirements and emerging technologies, with all new features accessible to all users.
# CIIM Product Lines
CIIM consists of six powerful applications and libraries, each designed to meet different business needs and enhance the overall experience.
- CIIM Core
- CIIM Management Interface
- Rosetta - API Gateway
- Proteus - A data transformation language built on top of JSONPath
- Cantaloupe - Our own Knowledge Integration build of the latest development version
- Harvey - Our 'phone home' monitoring service
# Features
We are passionate about innovation and committed to keeping our customers ahead in a fast-evolving digital world. Our dedicated team focuses on continuous product development, regularly adding new features, and fostering an active user community for sharing ideas and experiences.
# Core
- Highly scalable, proven to handle >80 million records (responsible for over 150 million resources worldwide)
- Fully automated, schedulable and extensible ingest framework
- Integrates with over 30 major collection, archive, library and digital asset management systems, as well as collections-related systems (e.g. diary, special collections access)
- Completely schemaless model, so no data gets excluded, i.e. all fields are available for searching/filtering
- Represents any type of metadata: examples range from GLAM metadata to archival TT race, course and rider data, human skeleton and skeletal pathology data, and theatrical productions
- Unify multiple collections in a form optimised for search and retrieval whilst retaining the full richness of all of the source data (i.e. no lowest common denominator model)
- Rule-based augmentation and editing of records
- Validation of data integrity / quality and record conformance with both predefined and customisable rules
- Graph representation of relationships
- Suite of AI integrations for media and metadata
  - Target specific fields for specific services
  - Automatically re-analyse changed fields/records
  - Automatically analyse new records which match defined rule sets
- Graduated publication rules for internal and external delivery
- Write-back to source systems (e.g. collections metadata to DAMS)
- Automatic media asset generation/resolution, including creating pyramid TIFFs for deep zoom/IIIF
  - Optional watermarking
  - Optional metadata embedding
- Automated publication processes to multiple ecosystems and endpoints (e.g. Elasticsearch, media server, SOLR, triple store, graph data store)
- Full auditing of record processing lifecycle
- Full auditing of what was extracted/processed, when, and where it was published to
- Handle cross-system references using arbitrary metadata fields
- Not just ‘object’ centric: manages authority and media data and all relationships, ensuring that combined records (for example, in an Elasticsearch index) are updated in response to related graph changes at both relationship and metadata level
- Internal audit of all extraction, processing and publication operations to ensure data integrity
- Manage hierarchy integrity for hierarchies of many millions of records
- Flag cyclical hierarchies
- All CIIM core functions available/controllable via internal API
- Author new records through submission to internal API
- Analytics integration
- Full data change history
- Add additional information to enrich searchable data (additional vocabularies, tags, search hints) to aid retrieval and increase search precision. This is kept in sync: it is applied to any new records that match the defined searches and removed from any records that no longer match
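
To illustrate what flagging cyclical hierarchies involves, here is a minimal Python sketch. It is not CIIM's implementation: the `parent_of` mapping (record id to parent id) and the function name are assumptions made purely for illustration. It walks each record's parent chain and reports the records that sit on a loop.

```python
def find_cyclic_records(parent_of):
    """Return the ids of records that sit on a parent-chain cycle.

    `parent_of` is a hypothetical mapping of record id -> parent id.
    A parent of None (or a parent missing from the mapping) is treated
    as the top of a hierarchy.
    """
    cyclic = set()
    explored = set()  # ids already walked and found not to be on a cycle
    for start in parent_of:
        path = []
        on_path = set()
        node = start
        while node is not None and node not in explored and node not in cyclic:
            if node in on_path:
                # Everything from the first visit of `node` onwards forms the loop.
                cyclic.update(path[path.index(node):])
                break
            path.append(node)
            on_path.add(node)
            node = parent_of.get(node)
        explored.update(n for n in path if n not in cyclic)
    return cyclic


# Illustrative data only: x -> y -> z -> x is a loop; w hangs off it.
hierarchy = {
    "root": None, "a": "root", "b": "a",
    "x": "y", "y": "z", "z": "x", "w": "x",
}
flagged = find_cyclic_records(hierarchy)
```

Each record is walked at most once after it has been classified, so the approach stays linear in the number of records, which matters for hierarchies of many millions of records.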
# Management Interface
- Secure OAuth2-based authentication using Keycloak, the industry-leading identity and access management solution, compatible with institutional authentication systems (e.g. Azure AD)
- Graduated feature access controlled through combinations of Roles and Groups
- Data dashboard to visualise and explore auditing of data extraction processes
- Fine-grained control of extraction and publication schedules for media and metadata
- Intuitive and fully customisable advanced cross collection searching
- Sort, filter/aggregate, search any field (or group of fields) in your data - set up your own shortcuts
- Advanced Lucene search syntax for expert and technical users
- Fully customisable templating system to present collection, archive and authority records and digital assets
- Unlimited rule-based template views for different data types and purposes
- Instant takedown of problematic or controversial records from public endpoints
- Lock records to allow online updates to happen at specific times to coincide with exhibition openings/press releases.
- WCAG compliant
- Explore data in internal and external indexes
- View records in their originating structure and format from their source system API
# Rosetta
- Highly flexible metadata delivery application
- Create any number of API endpoints through configuration only
- Dynamic, high-speed transformation of Elasticsearch responses using the Proteus language
- Remove known performance ‘bottlenecks’ with Elasticsearch (e.g. high-cardinality aggregations)
- IIIF (media and presentation)
- GraphQL
- OAI
- Linked Art
- Generate sitemaps for dynamic content
- Bespoke APIs for user-interface integration/presentation
- Swagger/OpenAPI documentation
- Compatible with Kong enterprise API gateway + Apache
- Export to CSV/JSON/Elasticsearch/S3
- Distributed to work at scale
# Proteus
- JSON to JSON transform/templating framework
- Dynamically resolves values from input record using JSONPath
- Extensible library of logical components (for each, switch, chaining etc.)
- Define and invoke reusable named template snippets
- REST API for testing Proteus scripts
- HTTP requests in-line
- Custom XML/CSV to JSON conversion
- Predicate matching
- File operations
- Date parsing/formatting
- Encryption
- Maths expressions
- Regex
- Caching
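
The general idea of a JSON-to-JSON template that resolves values from an input record by path can be sketched in a few lines of Python. This is not Proteus syntax and not how Proteus is implemented: the `render` and `resolve` helpers are hypothetical, and only a tiny subset of JSONPath (dotted keys and numeric indexes) is handled.

```python
import re

# Matches ".key" or "[0]" steps; a deliberately tiny subset of JSONPath.
_STEP = re.compile(r"\.(\w+)|\[(\d+)\]")


def resolve(path, record):
    """Resolve a simple path such as '$.makers[0].name'; None if missing."""
    current = record
    for key, index in _STEP.findall(path[1:]):
        try:
            current = current[key] if key else current[int(index)]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def render(template, record):
    """Walk the template, replacing '$...' strings with values from record."""
    if isinstance(template, dict):
        return {k: render(v, record) for k, v in template.items()}
    if isinstance(template, list):
        return [render(v, record) for v in template]
    if isinstance(template, str) and template.startswith("$"):
        return resolve(template, record)
    return template  # literals pass through unchanged


# Illustrative input record and template (field names are made up).
record = {"title": "Vase", "makers": [{"name": "Ann"}]}
template = {"label": "$.title", "creator": "$.makers[0].name", "type": "object"}
output = render(template, record)
# output == {"label": "Vase", "creator": "Ann", "type": "object"}
```

The output keeps the shape of the template, so the delivered structure is defined in one place while the values come from the source record.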
# Cantaloupe
- AWS role integration
- Graduated image size access based on IP and/or roles
- Image sizes either defined in config or based on media record attributes
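
As a rough illustration of graduated image size access, the Python sketch below picks a maximum image size from the client IP address and roles. It does not reflect the actual Cantaloupe configuration or delegate code: the rule table, role names, network ranges and pixel limits are all assumptions chosen for the example.

```python
import ipaddress

# Hypothetical rules, most permissive first:
# (max pixels on longest edge, trusted networks, roles). None means full size.
RULES = [
    (None, ["10.0.0.0/8"], {"staff"}),
    (2000, [], {"member"}),
]
DEFAULT_MAX = 800  # assumed limit for anonymous public access


def max_image_size(client_ip, roles):
    """Return the largest size the client may request (None = unrestricted)."""
    ip = ipaddress.ip_address(client_ip)
    for max_size, networks, allowed_roles in RULES:
        in_network = any(ip in ipaddress.ip_network(n) for n in networks)
        if in_network or allowed_roles & set(roles):
            return max_size
    return DEFAULT_MAX


internal = max_image_size("10.1.2.3", [])            # matches the trusted network
member = max_image_size("203.0.113.5", ["member"])   # matches by role
public = max_image_size("203.0.113.5", [])           # falls back to the default
```

The limits here are fixed in the rule table; in the same way, a limit could instead be read from an attribute on the media record, as the last bullet describes.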