2025-0155 Big Data AI Tech - Raw Data to SA (NS) NETHERLANDS - 11 Jun

Contract Type:

Contractor

Location:

The Hague - The Hague, Netherlands

Industry:

NATO

Contact Name:

Tim Lane

Contact Email:

tim@plr.ltd

Contact Phone:

Tim Lane

Date Published:

29-May-2025

Deadline Date:  Wednesday 11 June 2025
 
Requirement:  Big Data and AI Technology for Raw Data to Searchable Archives - Data Processing Pipeline
 
Location:  The Hague, NETHERLANDS
 
Time On-Site:  100%
 
Not to Exceed:  2025 BASE: NTE 12,645 Euro/Sprint (7 sprints, total NTE 88,515 Euro), 2026 OPTION
 
Period of Performance:  2025 BASE: 30 June 2025
 
Required Security Clearance:  NATO SECRET
 
Introduction:

  • The NATO Communications and Information Agency (NCIA), located in The Hague, Netherlands, is currently processing vast amounts of highly varied data coming from theatre for the purpose of efficient archiving.
  • In light of these activities, the Exploiting Data Science and Artificial Intelligence (EDS&AI) team within the NCIA Chief Technology Office is tasked with applying Big Data and AI technology to prepare, run and adjust pipelines that process various source data into archiving formats and metadata, and to prepare the data for (semantic) search.
  • NATO has an obligation to support national investigations into situations that occurred in theatre. To support the different teams involved as effectively as possible, the EDS&AI team brings the expertise to extract and exploit the vast and varied data at hand, using the Agency’s high-performance computing classified sandbox.
  • The EDS&AI team provides the core data science skills and technology needed for big data analysis and AI.
  • The EDS&AI team applies innovative technology to data whenever it is not possible to extract value with conventional approaches.
Objectives:
This Statement of Work (SOW) describes the work necessary to provide specific AI and Data Exploitation activities – which the NCIA CTO/EDS&AI team provides – for processing raw data from theatre to searchable archives. The services described below will be provided to the NCIA CTO/EDS&AI team, as they deliver specialised Data Science and AI results to their stakeholders in NATO Headquarters and NATO Allied Command Operations.
 
Overarching objectives:
  • make required documents from theatre accessible and searchable by archivists during execution
  • capture document contents into long term preservation formats
  • capture Functional Area System (FAS; back-up) contents into long term preservation formats
  • identify (and remove) duplicate documents, records of temporary value and non-records that are not required for archiving
  • provide (interim/final) data reports describing actions and results
Scope of Work:
Under the direction of CTO-EDS&AI, the contractor will execute 4-week sprints which cover, in line with the overarching objectives, the following:
  • Setting up / improving pipelines that process all required documents and that uniquely identify and trace decisions and processing steps. This is to be conducted on the provided classified sandbox environment, with the provided high-performance hardware and toolsets.
  • Implementing / improving (missing) pipeline steps for marking duplicate files, based on file attributes, path (structure) and content (similarity), and rules for considering a file or structure a duplicate.
  • Extracting document-format records from Functional Area Systems (FAS) databases and back-ups performed otherwise. Archiving SMEs and system SMEs are available for guidance on target formats, source system structure and data interpretation. Each FAS is processed separately; not all sprints touch upon this item.
  • Processing various office, image and video file types into the accepted archiving formats and monitoring the progress of that processing, including extraction of metadata and preparation of semantic search indexes.
  • Automating the registration of all processed documents, together with their semantic indexes, with the sandbox natural language search tool.
  • Automating the final copy of all non-duplicate and extracted archive documents with content and metadata to the NATO archiving system.
  • Reporting status, progress and statistics of the (raw) files being processed to archive formats, metadata and search indexes.
  • Delivering full reporting of results, trace of pipeline steps taken and (stakeholder) accepted failures.
  • Quarterly updates.
Not all listed items are expected to occur in all sprints. In general, most items will translate in most sprints to one of four actions: build (new pipeline or processing step), execute (reported progress on data batches), improve (optimized or corrected pipeline or processing step) or monitor (checks on logs and progress statistics).
Pipeline orchestration is expected to utilize KNIME. Reporting efforts are expected to target Microsoft Power BI dashboards. GitLab is expected to be used for source code management and documentation.
The content, scope, timelines and acceptance criteria of each sprint will be agreed in writing with the service delivery manager during the sprint-planning meeting.
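To illustrate the kind of duplicate-marking step described in the scope, the following is a minimal Python sketch, not the Agency's actual pipeline. It assumes an exact duplicate means identical file size and identical SHA-256 content digest, and that the copy with the smallest path is the one kept; both rules are hypothetical, since the SOW leaves the duplicate rules to be agreed. The function names (content_digest, mark_exact_duplicates, jaccard_similarity) are illustrative only. Content similarity is approximated here with Jaccard similarity over word shingles, which is one possible measure among many.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def content_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mark_exact_duplicates(root):
    """Map each duplicate file under root to the copy that is kept.

    Files are first grouped by size (a cheap file attribute), and only
    files sharing a size are hashed. Hypothetical rule: within a group
    of identical files, the smallest path in sort order is kept.
    """
    by_size = defaultdict(list)
    for path in sorted(p for p in Path(root).rglob("*") if p.is_file()):
        by_size[path.stat().st_size].append(path)

    duplicates = {}
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        first_seen = {}
        for path in paths:  # paths are already in sorted order
            key = content_digest(path)
            if key in first_seen:
                duplicates[path] = first_seen[key]
            else:
                first_seen[key] = path
    return duplicates


def jaccard_similarity(text_a, text_b, shingle=3):
    """Jaccard similarity of two texts over sets of word shingles."""

    def shingles(text):
        words = text.lower().split()
        if len(words) < shingle:
            return {tuple(words)} if words else set()
        return {
            tuple(words[i:i + shingle])
            for i in range(len(words) - shingle + 1)
        }

    set_a, set_b = shingles(text_a), shingles(text_b)
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)
```

In a sketch like this, the returned mapping doubles as a trace of each duplicate decision (which file was marked, and against which kept copy), which is the kind of record the reporting items above call for. Near-duplicate thresholds on the similarity score would need to be agreed with the archiving SMEs.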
 
Security:
The services required to be provided through this SOW require a valid NATO SECRET security clearance prior to the start of the engagement.
 
Constraints:
  • All the documentation provided under this statement of work will be based on NCI Agency templates and/or agreed with the NCIA service delivery point of contact.
  • All support, maintenance, documentation and required code will be stored under configuration management and/or in the provided NCI Agency tools.
Practical Arrangements:
  • The contractor is expected to provide services on-site at NATO Communications and Information Agency, The Hague, The Netherlands.
  • The services shall be provided during normal office hours following the on-site location calendar.
  • He or she will provide services under the direction and guidance of the CTO-EDS&AI or their designated representative.
  • ONE contractor must accomplish this work. In the event the contractor leaves during the contract period, a new contractor who has the proven required qualifications and is evaluated as qualified and suitable shall replace them. All normal AAS+ Framework Contract Terms and Conditions apply.
Specific Requirements:
  • At least 3 years of practical experience in the field of data science and/or data analytics;
  • Experience using data processing/visualization/analytics software packages and development environments, preferably KNIME, VS Code, GitLab, Power BI, Jupyter Lab, and Docker-based APIs;
  • Experience with Big Data processing, creating and utilizing containerized building blocks, and running containers (APIs) on Kubernetes clusters;
  • Experience with programming/scripting in languages like Python, R, SQL and working with data formats like CSV, XML, JSON;
  • Experience performing content extraction from files/databases/systems, and with (LLM-based) embedding models, entity extraction, keyword extraction and content similarity measures;
  • Creative, flexible and proactive in overcoming obstacles;
  • Good drafting, communication and presentation skills in English, at both technical and non-technical levels;
  • High attention to detail and accuracy;
  • Valid NATO SECRET Security Clearance.
Educational Qualifications:
  • Master’s degree in Computer Science, Engineering or a relevant field.
  • A higher degree in Data Science is preferred.
