Enabling Visual Studio Intellisense for Colectica SDK

Colectica SDK lets you easily work with the DDI data documentation standard in C#. A helpful way to learn about the different parts of the SDK is to use Visual Studio’s Intellisense. For Intellisense to know about Colectica SDK’s functionality, you just need to keep the XML documentation files alongside the referenced DLL assembly files.

Step 1. Locate the DLL and XML files for Colectica SDK


Step 2. Copy these files to a location near your project


Step 3. In your Visual Studio project, add references to the DLLs

When you add the references, be sure they point to the assemblies that are in the same directory as the XML files. These XML files contain the documentation that drives the Visual Studio Intellisense feature.
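If Intellisense shows members without descriptions, a quick check is to confirm that every DLL has its XML file beside it. The following is a minimal sketch of such a check; the class and method names are made up for illustration and are not part of Colectica SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of Colectica SDK.
public static class XmlDocCheck
{
    // Returns the file names of DLLs in `directory` that have no .xml
    // documentation file with the same base name next to them.
    public static List<string> FindDllsMissingXmlDocs(string directory)
    {
        return Directory.EnumerateFiles(directory, "*.dll")
            .Where(dll => !File.Exists(Path.ChangeExtension(dll, ".xml")))
            .Select(dll => Path.GetFileName(dll))
            .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Pointing this at the directory you copied the files to in Step 2 should return an empty list when every assembly has its documentation file.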


Step 4. Enjoy Intellisense

Once the assemblies are referenced, Intellisense is enabled.


Posted in Uncategorized

Colectica Repository’s Settings API

Colectica Repository and Portal have many settings that can be adjusted by the system administrator. These include language sort orders, synonyms, item types to display, and “appears within” configuration. A developer creating Colectica Addins recently asked how the settings API could be used to store settings for custom extensions.

Each Repository setting consists of a key associated with an optional string value and an optional long integer value. Four web service calls interact with the Repository settings store.

// Removes the repository setting.
void RemoveRepositorySetting(string settingName);

// Sets the repository setting.
void SetRepositorySetting(RepositorySetting setting);

// Gets the repository setting.
RepositorySetting GetRepositorySetting(string settingName);

// Gets all of the repository settings.
Collection<RepositorySetting> GetRepositorySettings();

The RepositorySetting class looks like this:

public class RepositorySetting
{
    public string SettingName { get; set; }
    public string Value { get; set; }
    public long? LongValue { get; set; }
}

Our built-in settings use this key-value store, normally with a JSON representation of our settings stored as the value. Here is an example of how we could store synonyms for DDI item types using the settings API.

// A key defined somewhere in the class
string SingularItemTypeNamesKey = "Colectica:Portal:SingularItemTypeNames";

// Add some synonyms
Dictionary<Guid,string> names = new Dictionary<Guid,string>();
names.Add(DdiItemType.DdiInstance, "Project");

// Create the repository setting to store
RepositorySetting setting = new RepositorySetting();
setting.SettingName = SingularItemTypeNamesKey;
setting.Value = JsonConvert.SerializeObject(names);

// Create the web services client and set the setting
WcfRepositoryClient client = GetClient();
client.SetRepositorySetting(setting);

Similarly, here is an example of how you could retrieve the singular synonyms.

// Create the web services client and retrieve the setting
WcfRepositoryClient client = GetClient();
var setting = client.GetRepositorySetting(SingularItemTypeNamesKey);

if (setting == null || setting.Value == null)
    return new Dictionary<Guid, string>();

// If the setting exists, deserialize it
var results = JsonConvert.DeserializeObject<Dictionary<Guid, string>>(setting.Value);

Best Practices

Use a unique key name. We suggest a colon-separated key in the form Organization:Product:Description. For example, MyOrg:MyCustomProject:MySettingName.

Store many settings using a single key. For example, if you are storing translations, you could store all of them under one key, with the data serialized as JSON or XML in the value. Our example used a dictionary. You could also use lists, a custom class, etc. This minimizes the number of web service calls made over the network.

Cache your settings once they are retrieved. If the settings will be used multiple times, cache them in your program so you do not have to call the web services repeatedly.
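The caching advice can be sketched as a small wrapper around the call that fetches a setting. This is only an illustration: CachedSettings is a hypothetical helper, not part of Colectica SDK, and the RepositorySetting class is repeated here as a stand-in so the sketch compiles on its own. It is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the RepositorySetting class shown earlier in this post.
public class RepositorySetting
{
    public string SettingName { get; set; }
    public string Value { get; set; }
    public long? LongValue { get; set; }
}

// Hypothetical helper: remembers settings that were already fetched so each
// key costs at most one web service call until it is invalidated.
public class CachedSettings
{
    private readonly Func<string, RepositorySetting> fetch;
    private readonly Dictionary<string, RepositorySetting> cache =
        new Dictionary<string, RepositorySetting>();

    // `fetch` would typically wrap client.GetRepositorySetting.
    public CachedSettings(Func<string, RepositorySetting> fetch)
    {
        this.fetch = fetch;
    }

    public RepositorySetting Get(string settingName)
    {
        RepositorySetting setting;
        if (!cache.TryGetValue(settingName, out setting))
        {
            setting = fetch(settingName);  // may be null when the key is not set
            cache[settingName] = setting;  // remember misses as well
        }
        return setting;
    }

    // Call after setting or removing a value so the next Get re-reads it.
    public void Invalidate(string settingName)
    {
        cache.Remove(settingName);
    }
}
```

With the client from the examples above, this could be constructed as new CachedSettings(name => client.GetRepositorySetting(name)).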

Posted in Colectica

Translating Colectica

Colectica has always allowed folks to create metadata in multiple languages, but the user interface itself has so far been limited to English. Naturally, many of our customers would prefer to use the software in their native language.

The Colectica desktop applications have lots of text, so the first step was for us to pull this text out of the source code and into resource files. We are using Resx files, for those familiar with .NET. Colectica has over 750 strings, but the task of extracting these was fairly simple using tools like Visual Localizer, WPF Localization Extension, and a modified WPF Localization Addin.

User-driven Translation

Since Colectica is focused on letting people create metadata about statistical data using the DDI standard, much of the text in the user interface is very domain-specific. Instead of using generalist translators, we think the result will be better if folks who are familiar with surveys and statistical metadata help with the translations.

Given the large number of strings, we didn’t want to give people a spreadsheet with hundreds and hundreds of rows and wish them luck with the task. Instead, we created a web application to manage the translation process. Translators simply log in, pick a category of text they would like to localize, and start submitting translations.

Image of the Colectica Translator

Text is categorized by where it appears in the application. This lets translators work on only a few strings at a time, and allows everybody involved to track the progress of a translation as work is performed.

Track progress of translations

Integrating the Translations

Once a translation is complete, we just ask the web app to generate the culture-specific resource files, which we then add to the Colectica project.

Automatically generate resx files for each translation

Once we rebuild Colectica, the users can choose which language they prefer to see.

Colectica, translated

What language do you like?

In what language would you like to see Colectica? Let us know, and if you would like to help with a translation we’ll be happy to show you how.

Posted in .net, language

DDI Workshop in Budapest

The Chain Bridge between Buda and Pest

For the past year or so we’ve been advising on a German project to develop some data documentation tools for folks working with employment data. As part of the project, last month we went to Budapest to conduct a training on DDI, the standard for data documentation. We worked with the developers at OPIT, who are creating a toolset called Rogatus that works with the DDI Lifecycle standard. Last year we provided an introductory workshop to the same developers, so this year we were able to dive deeper.

We had a few goals for the workshop:

  1. Learn what’s changed in DDI 3.2 over the past year
  2. Get in-depth with DDI content
  3. Since Colectica works with the same standard, make sure Colectica and Rogatus are interoperable

Bonus: Besides accomplishing these, a side effect of having four developers spend a full week with our eyes on the DDI 3.2 schemas is that we ended up performing a pretty thorough initial review of the draft standard. We were able to submit dozens of fix requests to the DDI Technical Committee, and I’m thrilled that the fixes have been incorporated into the schema.

Changes in DDI 3.2 over the past year

The OPIT team is targeting the upcoming DDI version 3.2 instead of the current 3.1 release. They knew going in that 3.2 was a moving target, but the improved developer-friendliness of the update makes it worthwhile. Since the schemas the developers were targeting were a year old, the first day of the workshop was focused on bringing everybody up to speed on the freshly-released draft 3.2 schemas. Briefly, what changed over the past year includes:

  • All item types can now be referenced; there are no places where items must be included inline.
  • All types now have consistently-structured Groups (e.g., VariableGroup, QuestionGroup, ConceptGroup)
  • Documenting datasets is much easier
  • Specifying data types for questions and variables no longer requires an extra level of indirection with “delineations”. What were those anyway?
  • Describing missing values is simplified
  • A few of the new elements were renamed, like DataElement → RepresentedVariable

DDI content covered at the workshop

It was nice to see the OPIT developers had DDI import and export functionality for a few of the element types already complete. The next step was to get into the schemas and look at more of the content, implementing more import and export functionality in a pair programming environment. The OPIT developers were able to add serialization for several new areas:

  • Questions with multiple response types
  • Question grids
  • Attaching “other, specify”-style responses to code lists
  • Question blocks, which are a group of questions preceded by some stimulus, like a picture or an article
  • Survey instruments, including those with conditional branches
  • DublinCore citation information

All that was successfully validated against the new DDI 3.2 XML schemas, too.

Of course a week isn’t enough time to implement everything, but we also covered other material in enough detail that the OPIT developers won’t have trouble writing the serialization for these areas:

  • Parameters and bindings in survey instruments
  • Question groups (and since they’re consistent, all other types of groups)
  • New data types, including nominal, scale, ranking, and distribution
  • Datasets, and describing the variables contained in a dataset
  • Storing summary statistics with a dataset description
  • Studies and series
  • Looping in survey instruments
  • Packaging and organizing items (DDIInstance, ResourcePackage, Schemes, Groups, Fragments)

Yes, that sure is a lot of metadata. It was an intense and productive week.

Interoperability with Colectica

The official release of Colectica works with DDI 3.1, but we have started working on 3.2 support. As we worked on the Rogatus 3.2 serialization, we were able to test and make sure the XML output could be read into Colectica. Happily, it worked without a hitch. Things worked well in the opposite direction, too. It’s great to see two totally separate tools talk to each other, using DDI as the shared language.

What’s Next

Next week we are off to Copenhagen for another intense, weeklong Colectica and DDI workshop.

Get in touch if you are interested in hosting a DDI workshop of your own. Or if you would like to come to Minneapolis, we are always happy to have you here.

Posted in .net, DDI

Colectica Web Services Facade for Third Party Datastores

In an enterprise, there are often numerous data management systems that contain specific data for different domains. These data silos are often difficult to integrate when creating a holistic view of the data life cycle. This post will detail how to create a web services layer over existing databases that will expose DDI metadata. DDI is an open standard for documenting the data lifecycle. Using DDI, multiple data sources can be combined to create the ‘big picture’ view.

The Read Only View

The simplest way to expose DDI from an existing system is to create a Web Services facade. This facade will implement several functions that are needed to expose a data source as an ISO 11179 repository, a standard on which DDI is based. One option is to allow the existing system to perform all updates and management of its own data, while providing a read only view to other systems for integration. To accomplish this with DDI and Colectica, the following abilities should be present in the web services facade.

Viewing an Item

The most basic function of a repository is to retrieve an item. In Colectica, this will most likely be an item serialized as DDI 3. Given an ISO 11179 international registration data identifier (IRDI), the web service calls GetItem and GetItems will return a RepositoryItem object containing information about an Administered Item and its XML serialization.


An ISO 11179 repository manages multiple versions of Administered Items. The web service call GetVersionHistory can list all versions of an item in a repository.

Relationships and Search

Searching for relationships between items is needed to efficiently browse items in a hierarchy. To enable relationship searching in a read-only view, the web services facade should implement GetRelationshipBySubject, GetTypedRelationships, and GetSet. To enable text-based searching, the web services facade should implement Search and SearchTypedSet.


Often when browsing, only basic information about an item is needed for display. This often includes the item type, its identity, and a basic label. Implementing GetRepositoryItemDescriptions to provide this basic information can speed up user interactions with the web services layer.


These nine abilities are all that is needed to create a read-only view on top of an existing data management system. These functions also enable the creation of local checkouts of the items.

  • If the system already manages items using the DDI standard, this is very straightforward.
  • If the system manages data in the DDI content model but not in a DDI serialization or versioning system, a translation layer may be required for the serialization and identification beneath the web service facade.
  • If the data managed by the system is not part of the DDI content model, the data should most likely not be put behind a web services facade. It should instead be documented using the DDI standard. This includes describing variables, datasets, and concepts that describe the data.
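
To make the item and version lookups concrete, here is a minimal in-memory sketch of two of the nine operations, GetItem and GetVersionHistory. The types ItemKey, StoredItem, and ReadOnlyFacadeSketch, and the method signatures, are hypothetical shapes chosen for illustration. They are not the actual Colectica web service contracts, and a real facade would read from the existing data management system instead of a dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration only; the real Colectica types differ.
// An item is identified by agency, identifier, and version.
public record ItemKey(string Agency, Guid Id, long Version);
public record StoredItem(ItemKey Key, string ItemType, string Label, string Xml);

public class ReadOnlyFacadeSketch
{
    private readonly Dictionary<ItemKey, StoredItem> items;

    public ReadOnlyFacadeSketch(IEnumerable<StoredItem> source)
    {
        items = source.ToDictionary(item => item.Key);
    }

    // GetItem: returns one specific version, or null when it is unknown.
    public StoredItem GetItem(string agency, Guid id, long version)
    {
        return items.TryGetValue(new ItemKey(agency, id, version), out var item) ? item : null;
    }

    // GetVersionHistory: all known versions of one item, oldest first.
    public List<long> GetVersionHistory(string agency, Guid id)
    {
        return items.Keys
            .Where(k => k.Agency == agency && k.Id == id)
            .Select(k => k.Version)
            .OrderBy(v => v)
            .ToList();
    }
}
```

Because the facade exposes no write operations, the existing system remains the only place where items are created or changed.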
Posted in Colectica, DDI

Testing schemas for the DDI 3.2 release

The changes from the DDI 3.2 public review have been entered into the source repository, and the final review of the changes is now taking place. A main focus of version 3.2 is consistency and usability, and the Technical Committee came up with a list of design and content guidelines to ensure this. This focus on consistency should allow users and developers to adopt DDI Lifecycle more quickly, since all the content areas should now be programmatically usable in the same ways.

Check out an example report on the current DDI 3.2 development schemas

During our review of 3.2, we created a tool to point out items in the DDI schema set that do not conform to these consistency guidelines. The tool analyses the schemas and creates an HTML report of items that should be addressed before release. It currently performs the following checks.

  • Validate schema set is DDI Lifecycle.
  • Check compilation of the schema as an XML Schema Set.
  • Versionables and Maintainables allowing inline or reference usage.
    • Versionables and Maintainables are in an xs:choice.
    • Versionables and Maintainables in an xs:choice contain two elements.
    • Versionables and Maintainables in an xs:choice contain a xxxReference.
  • FragmentInstance contains all Versionables and Maintainables.
  • Type of Object for references
    • Duplicate Element names detected for referenceable types.
    • Element names detected without a TypeOfObject defined.
  • Spell checking
    • Element names
    • Attribute names
    • XSD annotations/documentation
    • Breaking apart CamelCasedWords
    • Allows words to be added to dictionary
    • Uses en-US
    • Highlighting of misspellings in generated reports.
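
As an illustration of the inline-or-reference checks in the list above, the sketch below uses LINQ to XML to flag xs:choice groups that do not consist of exactly two elements, one of them a xxxReference. This is a simplified stand-in and not the tool's actual implementation: the names ChoiceCheck and FindNonConformingChoices are invented here, and the sketch inspects every xs:choice instead of only those involving Versionables and Maintainables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical, simplified check; not the code of the actual tool.
public static class ChoiceCheck
{
    private static readonly XNamespace Xs = "http://www.w3.org/2001/XMLSchema";

    // Describes every xs:choice that is not exactly two xs:element children,
    // exactly one of which has a ref or name ending in "Reference".
    public static List<string> FindNonConformingChoices(XDocument schema)
    {
        var problems = new List<string>();
        foreach (XElement choice in schema.Descendants(Xs + "choice"))
        {
            List<string> names = choice.Elements(Xs + "element")
                .Select(e => (string)e.Attribute("ref") ?? (string)e.Attribute("name") ?? "")
                .ToList();

            bool twoElements = names.Count == 2;
            bool oneReference =
                names.Count(n => n.EndsWith("Reference", StringComparison.Ordinal)) == 1;

            if (!twoElements || !oneReference)
                problems.Add("choice containing [" + string.Join(", ", names) + "]");
        }
        return problems;
    }
}
```

Each returned string could become one row in a report of items to address.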

In addition to checking the structures in the schema, the tool also spell checks all elements, attributes, and inline documentation to make sure that the released DDI has a professional feel. You can see an example report on the current DDI 3.2 schemas’ progress towards the consistency goals!
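
Spell checking element and attribute names first requires breaking CamelCased names into words. One way to do that is a regular expression that splits at case boundaries. The sketch below is an assumption about how this could be done, not the tool's code, and the CamelCase class name is made up.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; not the code of the actual tool.
public static class CamelCase
{
    // Split points: a lower-case letter or digit followed by an upper-case
    // letter ("BlankIs" -> "Blank|Is"), or an upper-case run followed by a
    // capitalised word ("DDIInstance" -> "DDI|Instance").
    private static readonly Regex Boundary = new Regex(
        @"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    public static string[] Split(string name)
    {
        return Boundary.Split(name);
    }
}
```

Each resulting word can then be looked up in an en-US dictionary, with acronyms such as DDI added to a custom word list.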

We have licensed the tool as Open Source under the LGPL and the code is available for download and forking on GitHub at https://github.com/DanSmith/DDISchemaCheck/.

There is also a release of the compiled tool on the releases page. Please email the DDI users list, send a tweet, or send us pull requests with any additional tests that you would like to see incorporated.

Posted in Colectica, DDI

Retrieving DDI items from the Colectica Repository

I received a followup question to my post about registering 11179 items in the Colectica Repository. This question involves working with the Colectica SDK and its DDI model in conjunction with the Repository.

How do I connect to the Repository and retrieve a DdiInstance, such as the YourDdiInstance() method in your previous post?

First we will create the repository client. In this example we will use the built in Active Directory authentication and send the credentials of the user running the program (The user who asked the question uses the Active Directory authentication and roles). Notice the username and password are not specified as they were in the example from my previous post.

// Create the web services client
var client = new WcfRepositoryClient("localhost", 19893);

If we know the item’s identification, we can retrieve the item. If not, we can perform a search on the repository. The basic GetItem has many variations with different processing options, retrieving item lists, and sets of relationships. The simple GetItem and GetLatestItem are shown below.

// Get an item by 11179 identifier
IVersionable item = client.GetItem(id, agency, version);

// Or get the latest version
item = client.GetLatestItem(id, agency);

DdiInstance instance = item as DdiInstance;

To make it extremely easy to work with DDI items in the Colectica Repository, we will wrap this client with additional methods using the DdiClient. This also avoids the type checking and casting needed to access properties of the DdiInstance that are not present on IVersionable. Similar methods exist for each DDI item type; the one for DdiInstance is shown below.

// Wrap the web services client
DdiClient ddiClient = new DdiClient(client);

// Get the Ddi Instance
DdiInstance instance = ddiClient.GetDdiInstance(
  id, agency, version, ChildReferenceProcessing.Instantiate);

The client calls allow controlling how child items are populated. If we have an unpopulated DdiInstance, we can use a similar method call to fill it with data and find its children.

// An unpopulated item with its identification. Child items
// may come back from the client as unpopulated depending on
// the child processing that is selected. Here is an example of
// how to populate such an item with the client

DdiInstance instance = new DdiInstance()
{
  Identifier = id,
  AgencyId = agency,
  Version = version,
  IsPopulated = false
};

// Populate the Ddi Instance
  false, ChildReferenceProcessing.Instantiate);

// Or as shown in Update 1 of my other post, populate the entire 
// item hierarchy from the Repository
GraphPopulator populator = new GraphPopulator(client);

// Do something with the instance
foreach(StudyUnit study in instance.StudyUnits)


Posted in Colectica

DDI 3 meets RDF and SPARQL with Colectica Repository

With the release of DDI version 3 (DDI Lifecycle), an effort was made to allow reuse and linkages throughout the content model. This created a rich model that allows for reuse and harmonization of metadata items through the use of referencing. When using the DDI 3 Addin for Colectica Repository, all of these relationships between metadata items are indexed and allow for the rich interlinkages of items as seen on Colectica Web. With all of this relationship information, wouldn’t it be nice to execute arbitrary queries about the relationships? Colectica Repository already offers relationship and set-based searching for registered metadata items, but a more powerful interface has now arrived.

Colectica Repository RDF Services

SPARQL is a query language created for searching RDF data and is standardized by the W3C. It allows for searching based on the relationships and literal data stored in an RDF graph or store. Colectica Repository now offers a new Addin with the ability to query DDI 3 as RDF using a SPARQL endpoint on Colectica Web and from the Repository with a web service! In addition, each DDI 3 item stored in the Colectica Repository can be downloaded in RDF using a Concise Bounded Description.

How are the RDF Services implemented?

The RDF Services is a new optional component for Colectica Repository. Colectica Repository has many extension points created with the help of the Microsoft Managed Extensibility Framework (MEF). This allows custom Addins to be created and deployed by dropping a new assembly into the Addins folder. The DDI 3 Addin uses the Item Format extension point and many customers are already familiar with it. Another extension point is the Post Commit Hook. The RDF Services are implemented as a post commit hook and query for RDF serializers for the item using MEF. They then store the RDF serialization of the DDI 3 item in the Colectica Repository.

RDF Examples

I will show some examples from the US 2010 Census sample DDI 3 file. The following is the RDF serialization of question 6 as a Concise Bounded Description (CBD). The question has a coded classification and question text in several languages.

@base <http://data.colectica.com/item/us.colectica/ba540279-8bfc-461c-a07b-d25493c648a7/31>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ddi: <urn:ddirdf:>.
@prefix ddit: <urn:ddirdf:type:>.
_:autos1 a <ddit:CodeDomain>;
         <ddi:HasCodeScheme> <http://data.colectica.com/item/us.colectica/e2648604-e1a5-4a75-8b0e-52a5c13dd89c/13>;
         <ddi:ResponseDomainBlankIsMissingValue> false.

<http://data.colectica.com/item/us.colectica/ba540279-8bfc-461c-a07b-d25493c648a7/31> <http://purl.org/dc/elements/1.1/title> "Q6"@en-us;
    a <ddit:Question>;
    <ddi:AgencyId> "us.colectica"^^xsd:string;
    <ddi:EstimatedTime> "PT0S"^^xsd:dayTimeDuration;
    <ddi:HasCodeDomain> _:autos1;
    <ddi:HasCodeSet> <http://data.colectica.com/item/us.colectica/e2648604-e1a5-4a75-8b0e-52a5c13dd89c/13>;
    <ddi:Id> "ba540279-8bfc-461c-a07b-d25493c648a7"^^xsd:string;
    <ddi:QuestionIntent> "Asked since 1790. Census data about sex are important because many federal programs must differentiate between males and females for funding, implementing and evaluating their programs. For instance, laws promoting equal employment opportunity for women require census data on sex. Also, sociologists, economists, and other researchers who analyze social and economic trends use the data."@en-us;
    <ddi:QuestionText> "¿Cuál es el sexo de la Persona 1?"@es,
          "Jinsia ya Mtu wa 1 ni ipi?"@sw,
          "Quel est le sexe de la Personne 1 ?"@fr,
          "Seksi i Personit 1?"@sq,
          "Was ist das Geschlecht von Person 1?"@de,
          "What is Person {PersonCounter}'s sex?"@en-us;
    <ddi:UserId> "Colectica:UserAssignedId:Q6"^^xsd:string;
    <ddi:Version> 31 ;
    <ddi:VersionDate> "2011-02-09T10:11:14"^^xsd:dateTime;
    <ddi:VersionRationale> "Publishing study"@en-us.

Question 5 shows the use of multiple response domains.

@base <http://data.colectica.com/item/us.colectica/6fc6291a-b9c1-4698-83f8-3983c2ec8cb4/28>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix ddi: <urn:ddirdf:>.
@prefix ddit: <urn:ddirdf:type:>.

_:autos1 <dc:description> "Apellido"@es,
           "First Name"@en-us,
           "Jina la Mwisho"@sw;
         <dc:title> "Apellido"@es,
         "First Name"@en-us,
         "Jina la Mwisho"@sw;
         a <ddit:TextDomain>;
         <ddi:ResponseDomainBlankIsMissingValue> false.

_:autos2 <dc:description> "Emri"@sq,
           "Jina la kwanza"@sw,
           "Last Name"@en-us;
         <dc:title> "Emri"@sq,
         "Jina la kwanza"@sw,
         "Last Name"@en-us;
         a <ddit:TextDomain>;
         <ddi:ResponseDomainBlankIsMissingValue> false.

_:autos3 <dc:description> "Emri i dytë"@sq,
           "Herufi ya Kati"@sw,
           "Initiale 2e prénom"@fr;
         <dc:title> "Emri i dytë"@sq,
         "Herufi ya Kati"@sw,
         "Initiale 2e prénom"@fr;
         a <ddit:TextDomain>;
         <ddi:ResponseDomainBlankIsMissingValue> false.

<http://data.colectica.com/item/us.colectica/6fc6291a-b9c1-4698-83f8-3983c2ec8cb4/28> <dc:title> "Q5"@en-us;
    a <ddit:Question>;
    <ddi:AgencyId> "us.colectica"^^xsd:string;
    <ddi:EstimatedTime> "PT0S"^^xsd:dayTimeDuration;
    <ddi:HasTextDomain> _:autos1, _:autos2, _:autos3;
    <ddi:Id> "6fc6291a-b9c1-4698-83f8-3983c2ec8cb4"^^xsd:string;
    <ddi:QuestionIntent> "Listing the name of each person in the household helps the respondent to include all members, particularly in large households where a respondent may forget who was counted and who was not. Also, names are needed if additional information about an individual must be obtained to complete the census form. Federal law protects the confidentiality of personal information, including names."@en-us;
    <ddi:QuestionText> "¿Cuál es el nombre de la Persona 1?"@es,
                  "Jina la Mtu wa 1 ni lipi?"@sw,
                  "Quel est le nom de la Personne 1 ?"@fr,
                  "Si quhet Personi 1?"@sq,
                  "What is Person {PersonCounter}'s name?"@en-us,
                  "Wie lautet der Name von Person 1?"@de;
    <ddi:UserId> "Colectica:UserAssignedId:Q5"^^xsd:string;
    <ddi:Version> 28 ;
    <ddi:VersionDate> "2011-02-09T10:11:14"^^xsd:dateTime;
    <ddi:VersionRationale> "Publishing study"@en-us.

SPARQL Examples

Let’s take a look at some SPARQL queries that I can run across the DDI 3 RDF stored in the Colectica Repository. The first one looks for studies that I have created since January 2010.

PREFIX ddi: <urn:ddirdf:>
PREFIX ddit: <urn:ddirdf:type:>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?study
WHERE {
    ?study a ddit:StudyUnit;
    dc:date ?creation_date;
    dc:creator <http://dan.smith.name/who#dan>.
    FILTER ( xsd:dateTime(?creation_date) > "2010-01-01T00:00:00"^^xsd:dateTime ) .
}
ORDER BY ?study

This second query will give us a count of how many times a variable has been reused/harmonized across datasets.

PREFIX ddi: <urn:ddirdf:>
PREFIX ddit: <urn:ddirdf:type:>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?variable (COUNT(?parent) AS ?c)
WHERE {
    ?parent ddi:HasVariable ?variable .
    ?parent a ddit:Dataset
}
GROUP BY ?variable
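
Queries like these can also be sent from code. The SPARQL 1.1 Protocol allows a query to be passed to an endpoint as a percent-encoded query parameter on a GET request. The sketch below only builds that request URI; the class name is made up, and the endpoint address of a given Colectica Web installation is an assumption you would need to supply.

```csharp
using System;

// Hypothetical helper; the endpoint address depends on your installation.
public static class SparqlRequest
{
    // Builds a GET request URI in the SPARQL 1.1 Protocol "query via GET"
    // form: <endpoint>?query=<percent-encoded query text>.
    // Any query string already present on the endpoint URI is replaced.
    public static Uri BuildQueryUri(Uri endpoint, string sparql)
    {
        var builder = new UriBuilder(endpoint);
        builder.Query = "query=" + Uri.EscapeDataString(sparql);
        return builder.Uri;
    }
}
```

The resulting URI can be requested with any HTTP client, using an Accept header such as application/sparql-results+json to choose the result format.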

Next Steps

Colectica Repository DDI 3 RDF Services is now available as a community technology preview to interested customers. There are several items to take note of while using it:

  • There is not yet an official DDI RDF vocabulary. The namespace and names of elements may change in the future.
  • Several external vocabularies are also currently used. These include rdf, rdfs, dc, dcterms, owl, xsd, and foaf.
  • Per-item ACLs on metadata items in the Colectica Repository are not yet enforced by the SPARQL interface. This means that all metadata items are readable by all users. Deploy accordingly.
  • SPARQL UPDATE is disabled to maintain consistency with the versioned metadata items in the Repository.
  • You can set the location of your Colectica Web installation in the RDF Services so that the generated URLs for items are resolvable in the browser. This allows users to see a nice web based view of the information.

Feedback is welcome!


Posted in Colectica