Scenario Analysis


Scenario planning is a business analysis tool which helps determine if a scenario is plausible. Indicators are identified to help determine the likelihood of specific events occurring. A scenario analysis requires a focal question that describes our scenario. For the purpose of this analysis and in the context of the research question, the focal question is: What are the benefits of OD?

The following is a list of uncertainties about future outcomes and potential value of OD, as reflected by the presumed benefits of OD identified by the GC. These are uncertainties of how OD might benefit Canadians:

  1. Supports innovation
  2. Leverages public sector information to develop consumer and commercial products
  3. Enables better use of existing investment in broadband and community information infrastructure
  4. Supports research
  5. Supports informed decisions for consumers


It is likely that OD will support innovation and research, but this support is difficult to measure. The mere process of releasing OD will in turn help leverage public sector information; the private sector and governments will both benefit from the release of OD. The two most uncertain and important factors remain the support of innovation and the support of research. This is aligned with the survey result where 96% of participants agreed that innovation was very important for Canada. In addition, there are still uncertainties as to how OD might support research and innovation. The previous diagram represents the importance and uncertainty of each presumed benefit.


Scenario Narratives
The previous diagram provides a graphical representation of four possible outcomes within this scenario framework. By taking the two most important but uncertain factors, the support of innovation and the support of research, we can build a plausible narrative for each quadrant that explains how these two factors could lead from the current state of affairs to the future that quadrant describes.
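
The four quadrants follow mechanically from crossing the two uncertainties. The sketch below is only an illustration of that two-by-two construction; the axis names are taken from the scenario titles, while the function name and data layout are assumptions made for the example.

```python
from itertools import product

# The two critical uncertainties, named after the scenario titles in the text.
AXES = ("economic growth", "value to research")


def build_scenarios():
    """Return the four quadrants as {label: {axis: realized?}}."""
    labels = "ABCD"
    # product yields (True, True), (True, False), (False, True), (False, False),
    # which matches the order of Scenarios A to D in the text.
    combos = product((True, False), repeat=len(AXES))
    return {label: dict(zip(AXES, combo)) for label, combo in zip(labels, combos)}


scenarios = build_scenarios()
for label, outcome in scenarios.items():
    print(label, outcome)
```

Each entry corresponds to one narrative: for instance, Scenario B is the quadrant where economic growth is realized but value to research is not.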

Scenario A: Spur economic growth and value to research
In this scenario OD informs and supports research and private sectors with valuable data and in turn helps research and development to increase Canada’s competitive ability in the world market. New markets for open data are created through the development of solutions and novel ideas. Collaboration is heightened and governments are viewed as transparent. Innovation provides knowledge that informs strategic directions for governments and industries. Research and development is enhanced, rapid advancements and developments are created, and research outcomes become a catalyst of growth in Canada. OD enables the creation of new markets and services. A new demand for knowledge is created which creates a need for new enterprises. The sharing of data is in high demand and new companies are created to meet the demand for new products and services.

Scenario B: Spur economic growth but no value to research
In this scenario, OD enables the creation of new markets and services. A new demand for knowledge is created which drives new enterprises. The sharing of data is in high demand and companies are created to meet the demand for new products and services. Meanwhile, OD does not support research and newfound knowledge is not obtained. If research and development efforts do not enhance the advancement of new and innovative products and services, research will not become a catalyst of growth in Canada and the economy will eventually suffer. Even if the sharing of public data heightens the economy, it will not have a long-term effect on the economy if OD does not reinforce research. The creation of new markets will be temporary if the economy is not supported by enhanced research and development.

Scenario C: Value to research but no economic growth
In this scenario OD informs and provides the research community and private sector with valuable data but does not help increase Canada’s competitive ability in the world market. OD does not support innovation and economic growth, and new products and services are not created. The development of solutions and novel ideas does not enable the creation of new markets and services. Research and development does not enhance advancements and innovation does not become a catalyst of growth in Canada.

Scenario D: No economic growth and no value to research
This scenario is highly unlikely given the value of shared information. In this scenario OD does not support research and private sectors with valuable data. It does not help economic growth, nor does it increase Canada’s competitive ability in the world market. New markets are not created through the development of solutions and novel ideas. Furthermore, collaboration would not improve and governments would not be viewed as transparent. Innovation does not become a catalyst of growth in Canada.

The plausible outcome of this analysis is determined by the probability of a specific scenario occurring. The likelihood that OD creates and supports innovation and research is very high. We have already identified the academic community as primary users of OD and the private sector is increasingly using OD to bolster innovation. These are the users that have the skills and tools to aggregate data and create added value from their in-depth analysis of the data, which will in turn provide heightened information, knowledge and wisdom to society. The release of OD is likely to create benefits for these groups and in turn provide them with additional resources to complete other relevant tasks.

Research Methodology


In order to identify the social, economic and environmental benefits of shared data from public organizations, an analysis of both primary and secondary research comprised mostly of quantitative information was conducted. The original intent of this research was to conduct a cost-benefit analysis, but after further assessment, the need to determine the return on investment for governments became a moot point since most organizations are mandated to share Public Sector Information (PSI). Identifying costs of publishing OD will not help identify benefits. The focus of the analysis will be to determine and categorize the various benefits that OD can provide. Similar to qualitative evaluation methods used to analyse human services programs, this research methodology will attempt to identify social benefits of public service programs.

Open and accessible data can sometimes include OD published by private organizations. For the purpose of this paper, references to OD or primary data will include data accessible from public organizations while secondary data originates from external parties. In addition, references to internal stakeholders consist of individuals from a public organization while external stakeholders refer to individuals representing academic, private, for-profit or not-for-profit organizations or associations.

Primary Research
A survey targeted to individuals involved or working with OD was conducted to help establish economic and social benefits. Individual interviews with representatives from public organizations were conducted to identify challenges with publishing OD while interviews with external experts involved or working with OD were also conducted to help identify challenges and benefits of using OD. Furthermore, a focus group session was also conducted with external experts involved or working with OD.

Literature Review and Secondary Research
An initial review of academic literature was carried out, focusing on research knowledge of public organizations, shared data information practices, technology trends and changes related to public sector information. A review of secondary research was also conducted on government transparency and public sector information including open data, open access, open source, open government and trends to open and accessible information.

Information gathered for this research is comprised mostly of qualitative information. This required a research method that was qualitative in nature and capable of understanding the phenomenon and the behaviors towards the OD movement. This paper uses a scenario analysis to determine the likelihood of specific events occurring. In addition, this methodology can only demonstrate the benefits linked to the data being accessible and not the quality or value of the data itself.
There are limitations to the design employed in the current study because a scenario analysis attempts to determine plausible outcomes. Furthermore, sample restrictions from a limited scale of the population and few longitudinal measurements from different levels of governments can also affect the outcome. Despite its limitations, a scenario analysis was used because it analyses future events by considering possible alternatives, which determines the impact and benefits of OD.

The three methods used for collecting primary data included: a survey of potential individuals involved with Open Data, a focus group session with external participants and interviews with both internal and external informants. The survey and interviews included quantitative questions that focused on dollar figures and performance indicators and qualitative questions focused on opinions, viewpoints and trends. Participants completed the survey within 20 minutes; the focus group session and most interviews lasted one hour.

Information was first collected and analysed with the survey. The outcomes of these analyses were then validated with the focus group session and interviews. The survey was presented in an online format using the “SurveyMonkey” platform; it included 22 questions, which are listed in Appendix 1. The survey was distributed to 42 specific groups or organizations working with OD, and invitations to participate in the survey were sent to all individuals involved with these groups. The survey was open for a period of seven weeks and 123 participants responded. The results were collected and analysed in order to inform further research.

The interviews included predefined, open-ended questions that allowed for improvised responses which generated additional questions. A list of the predefined questions can be found in Appendix 2 and Appendix 4. This unstructured format was ideal for qualitative questions and allowed further discovery of individual viewpoints. Interviews were conducted face-to-face or by phone; each participant signed a consent form and handwritten notes were taken. Five interviews were completed with internal stakeholders while five additional interviews were conducted with external stakeholders.

The objective of the focus group session was to generate a conversation and a diverse set of options derived from specific challenges with OD. The session was conducted face-to-face and included three participants. The process consisted of exploring ideas in a divergent manner consistent with the Creative Problem Solving (CPS) model for generating novel ideas to address specific challenges. Two challenges were presented to the group in the form of a question; these questions can be found in Appendix 3. The informal setting and the divergent conversations allowed participants to share stories and engage in further discussions on the topic of OD.

The topic of open and accessible government information is a profound issue for several advocacy groups, which include a wide range of external stakeholders. Consistent with qualitative research methodology, these samples of the population were not randomly selected, yet they provided useful insight into the phenomenon of OD.

Specific groups of participants were targeted for each method of collecting primary data. Participants for the survey consisted of individuals involved or working with OD from any country and sector including academic, public, private, for-profit or not-for-profit organizations. Participants of the focus group also included individuals involved or working with OD but only from academic, private, for-profit and not-for-profit organizations. Interviews were conducted with two types of participants. The first included specific individuals from public organizations who were responsible for publishing OD. The second were individuals utilizing OD from private, for-profit and not-for-profit organizations, as shown in the following diagram.

Collection method

Survey
  Involved or working with Open Data
  Academic, public, private, for-profit or not-for-profit organizations or associations

Focus group session
  Involved or working with Open Data
  Academic, private, for-profit or not-for-profit organizations or associations
  Federal, Provincial or Municipal

Interview (Internal)
  Responsible for publishing Open Data
  Public organizations
  Federal, Provincial or Municipal

Interview (External)
  Involved or working with Open Data
  Academic, private, for-profit or not-for-profit organizations or associations
  Federal, Provincial or Municipal

Data Analysis
In order to determine plausible outcomes derived from OD, a scenario analysis was conducted. A scenario analysis identifies the likelihood of a specific event occurring, and in turn determines its impact on the benefits of OD. This provided a deeper examination of obvious issues and patterns related to OD – increasing the importance of social issues.

Data source
  Primary sources: front-line stakeholders from public and private sector
  Secondary sources: existing academic literature

Collection method
  Online survey
  Focus group session
  Review of relevant document literature and secondary research

Scenario analysis
Contingent valuation method

The analytic tools used to analyse the data collected from the survey included the online tool available from “SurveyMonkey” and Microsoft Excel. Data was transferred from the survey to Excel spreadsheets allowing for further analysis by aggregating data results from several sources.

Research Findings – Secondary Research


The following is a brief summary of the secondary research for this paper. In the context of the research question, the review touches on elements of Information Management (IM) practices within the GC, data standards and recognized benefits of sharing Public Sector Information (PSI).

Information Management
In 2010, Library and Archives Canada (LAC) conducted a formative evaluation to determine government accountability regarding policy, standards and directives. Key informants were interviewed including 39 representatives from various GC departments and agencies. The results of the evaluation highlighted resourcing as a main obstacle to implementing IM practices.

The Treasury Board of Canada Secretariat is leading the efforts to strengthen the management of electronic information and in 2011, the Secretariat conducted a government-wide audit of electronic recordkeeping practices in large departments and agencies. The audit found that organizations are at risk of not effectively identifying and retrieving information needed for effective decision making. This risk is caused by exponential growth of electronic information outpacing internal resources for information management.

In addition, the Canadian government has been criticized for its performance in responding to access to information requests. Nearly half of the requests submitted to the Government of Canada exceed the thirty-day limit prescribed by the Access to Information Act. Furthermore, improvements to the dissemination of OD in response to these requests could lead to more cost-effective practices.

The European Commission is working on an initiative for a common standard that will facilitate the cross-referencing of data and interoperability in order to provide greater benefits to users. A working group has been assembled, consisting of data experts from public and private organizations across 20 countries including Australia and the United States (US). In addition, the Open Data Institute is actively crowdsourcing to develop criteria that will help organizations assess the value of possible datasets. Meanwhile, Natural Resources Canada (NRCan) has implemented and adapted several geospatial standards, guidelines and best practices which allow applications and systems to effectively operate with each other.

Recognized Benefits
Since February 2012, Statistics Canada has been disseminating its data at no cost; information from their CANSIM database and census data is now OD. This has important social and economic benefits, especially for small organizations, research initiatives and not-for-profit organizations which could not afford the cost of accessing these datasets in the past.

OD can also be used to prevent fraudulent activities. A review of questionable expense claims and receipts released as OD has helped the Canada Revenue Agency save $3.2 billion in tax receipt claims, which were disallowed. In addition, the OD provided by Environment Canada (weather, air and water quality data) helps inform citizens and identify areas with climate issues.

The US government has also embarked on similar initiatives where the release of weather data has “benefited the American people and contributed to economic growth and jobs.” Additionally, the government has created mobile and web applications which provide a directory of community clinics to help citizens gain easier access to locations near them.

Meanwhile, the Danish government estimates that making basic data open and freely accessible will save the public sector $45 million per year. Likewise, the Danish private sector is estimated to save $87 million by reducing the cost of acquiring data, improving public services and adding opportunities for new digital products and services.

The British Government has reduced costs through the prevention of fraudulent activities and internal operating efficiencies with dissemination of internal PSI. A recent case study demonstrates how data sharing, both within and between departments, could save $65 billion through the internal and effective use of PSI.

Other benefits can be achieved when governments or private companies re-publish OD in the form of web or mobile applications. Recollect, a Vancouver company, developed and implemented a standard for managing garbage and recycling related data to remind citizens about their local collection days; this system has proven successful in several cities.

Applications that can provide useful and beneficial information to citizens can range from services that include border crossing waiting time, duty calculators, environmental and land conditions, transportation and bus information, tourist and recreational information, vehicle recalls, currency converters, property taxes, vaccine clinics, health products and locations of external defibrillators.

Research Findings – Primary Research


The following is a summary of the primary research for this paper. In the context of the research question, the review touches on the characteristics of the participants that used OD, their motivations and issues surrounding the use of OD in project delivery.

Online Survey
The survey included three series of questions. The first referred to information about the participants, the second examined their involvement with OD, and the third asked questions related to specific OD projects. Requests to complete the survey were sent to 42 separate organizations or groups known to work with open data. A total of 123 participants responded to the survey but only 99 completed it in its entirety. The most common geographic location of participants was Canada (44%), followed by the European Union (22%) and the United States (18%). The geographic locations of participants are provided in the following table.


In addition, nearly 70% of respondents were between the ages of 25 and 44 and more than 80% of all participants had a university degree. Specific percentages related to age groups and education are provided within Appendix 6.

Users of Open Data
The users of PSI generally span multiple stakeholder groups that are loosely connected by community interest. The users of OD can be categorized within three separate groups. The first group consists of analysts who prefer data in a raw format with the least amount of filtering applied. This group includes academics, economists and journalists who need to analyse and interpret data in order to contribute to their work-related deliverables. In addition, this group can also include other government organizations or internal departments within the same organization. The second group includes individuals who would prefer dynamic or automated access to data. These are the software developers who need access to clean and reliable data through automated methods such as an Application Programming Interface (API). For the most part their objective is to render the data in a visual interface for a specific audience, which requires some level of data analytics. In most cases, this involves aggregating multiple datasets. The last group is comprised of regular citizens or special interest groups who prefer accessing data through an interface with a visual interpretation of the data. Most have limited or no skills for analysing and aggregating datasets and believe that governments are obligated to provide visual interfaces to OD. Organizations working with OD can be found in the following table.

Organizations working with Open Data

Motivation for using Open Data
Interest in OD is developing in a bottom-up direction. Special interest groups and not-for-profit organizations are realising the potential benefits of OD and how it can help them achieve strategic objectives. For most countries, the movement is still just beginning and a lot of people are very interested in learning more about the topic. Notwithstanding, a very small portion of people are using OD in the hope of building applications and/or web services. The largest users of OD remain academic institutions and the private sector.

Micro and Macro data
The benefits of OD information are segregated into two levels of information, micro and macro data. Micro data refers to information that is relevant to individuals and their daily activities, such as bus schedules, road closures and recreational activities. Not surprisingly, smaller governments like municipalities publish micro data. Macro data, on the other hand, consists of data with a wider geographical span such as national and international level information. Macro data can impact larger subsets of the population and be specific to an area or topic. This level of data is dependent on statistical, population and geographical data. In some cases, several micro datasets can be aggregated with macro datasets. Although both are important to citizens, macro data tends to impact a greater number of people. The majority of the OD users that responded to the survey commonly use macro data, including statistical data closely followed by research, population and geographical data. Specific percentages related to the type of data used by survey participants are provided within Appendix 6.

The Source of Open Data
Surprisingly, the source for data most commonly used by survey participants was not open. Significant amounts of data are published as web content on public organizations’ web sites but are not available in an open format. This lack of availability has created a culture of scrapers – individuals who take data from web sites and openly re-use it. That data is then transferred into structured datasets or databases where it can be re-used. In some cases this process is automated, and if web content changes the scraping process is triggered again. It is a complicated method of extracting data and an even more complex way to maintain and update data. In addition, scraping web content falls short of OD license agreements since the data is not available through an OD portal. An interesting outcome of the survey was that 8% of respondents who were working on a specific OD project and working for a federal government organization were also scraping data. It seems that in some cases it is more cost-effective for governments to scrape their own web content rather than to extract the data from its original source. Percentage of macro and micro sources of OD from specific organizations can be found in the following diagram while information related to sources used by separate nations is provided within Appendix 6.

Sources of data by Organization

Preferred methods for accessing Open Data
The preferred methods for accessing OD remain with tools and software that are commonly used or openly available. Formats such as comma-delimited files, which are not proprietary to specific software, are preferred for accessing and managing OD. In addition, the preferred tools used to manage and manipulate OD are also not proprietary; open source software is commonly used but in most cases personal preference outweighs the decision and individuals will use tools that are familiar to them. Specific percentages related to the preferred method for accessing the data by survey participants are provided within Appendix 6.
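
Part of the appeal of comma-delimited files is that they can be read with freely available tools and no proprietary software. As a minimal sketch, the example below parses a small comma-delimited dataset using only the Python standard library; the sample data, column names and function name are hypothetical and do not come from any dataset used in this research.

```python
import csv
import io

# Hypothetical sample standing in for a published comma-delimited dataset.
SAMPLE_CSV = """route,stop,departure
10,Main Street,08:15
10,Main Street,08:45
22,City Hall,09:00
"""


def load_rows(text):
    """Parse comma-delimited text into a list of dictionaries keyed by header."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(SAMPLE_CSV)

# Group departure times by route, a typical first step before any analysis.
departures_by_route = {}
for row in rows:
    departures_by_route.setdefault(row["route"], []).append(row["departure"])

print(departures_by_route)
```

The same few lines would work on a file opened from disk, which is one reason non-proprietary formats lower the barrier to re-use.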

When asked what benefits are derived from OD initiatives, the majority of the participants indicated that OD will support innovation, that citizens have the right to public data, and that public information should be openly accessible and available online. In addition, releasing new data was more important for participants than improving the quality of existing data. Specific percentages for questions related to the importance and benefits of OD initiatives are provided within Appendix 6.

Open Data Projects
Included in the survey were specific questions related to OD projects. A total of 48 participants responded to these questions. Projects included local initiatives for housing, charities, education and health-related data. International projects were also included, which looked at opportunities with open source software, data visualizations and aggregation with social media and big data. A few projects also included government initiatives for publishing OD and others included specific work toward new standards.

Several participants working on projects belong to different advocacy or special interest groups, which create personal projects in an attempt to make governments accountable for their actions. These groups also lobby governments with an aim to convince them to adopt OD policies. Other personal initiatives included the aggregation of statistical data in an attempt to identify trends to help charities and not-for-profit organizations determine their strategic direction.

Of the participants working on specific projects, 23% identified federal governments as their primary client and nearly 84% stated that they are likely to use OD again. The average of primary clients from participants working on specific projects can be found in the following table.


In addition, the following table defines the participants working on specific projects that agreed with each comment.

Comments related to Open Data Projects

Critical issues in government today
The survey offered an open-ended question allowing participants the opportunity to contribute what they believed were the top three issues facing government organizations today. In response to this question, 35% of the participants referred to the relevance of available OD, specifically expressing concern regarding its accuracy and availability. Financial restrictions were also identified as an issue for governments to effectively capture and publish OD. More than 25% of the participants believed that effective dissemination of OD could help governments cut costs and mitigate risks caused by cutbacks.

Furthermore, participants expressed that governments have an embedded organizational culture that does not share information. One quarter of participants stated that governments need to take action and make top-down changes:
The governments will need to build a culture of open data engagement, which will require a change in mindset, and culture. (Participant 97)

There were also several comments regarding the quality of OD. Relevant information about datasets, including the method of collection, metadata and instructions for re-use, is often missing. Inconsistency was also a relevant concern, which relates to a lack of existing standards for information management. In some cases, participants indicated that information about the datasets is simply insufficient to effectively aggregate them with other datasets; this was especially true for geographic information. Real transparency was also a common theme. Several participants believed that members of government do not understand the significant importance of OD and the impact it can have for citizens:
Increased visibility of data tends to increase its accuracy and quality through feedback from users. (Participant 74)

The issue of enabling and facilitating public access to open data with visual interface was also a concern.

The interviews included 10 participants, 5 from public organizations and 5 from private organizations.

Public Organizations
The majority of the interview questions for public officials consisted of quantitative questions to identify direct and indirect costs for the preparation and dissemination of OD. However, most governments’ operating costs attributed to OD are small or unknown because they are absorbed within existing information technology operations and sections. Extracting the cost was complex and in most cases organizations were not aware of the actual funds used for OD initiatives. In other cases, some GC departments and agencies were simply not willing to share these costs.

With the exception of large GC departments, manual intervention by staff members is required for the dissemination of new datasets or updates to existing ones. Three public organizations interviewed provided approximate costs for human resources and for operations or maintenance, including direct and indirect costs. The cost for the dissemination of OD within these organizations averaged $130 thousand CDN per year. The following table contains the average costs from each organization.


Two organizations spent time developing a cost recovery model for the preparation and dissemination of OD by individual datasets. They estimated that the time spent publishing each new dataset is approximately 275 hours, from the analysis to the deployment phase. What was not clear were the indirect costs and time spent on publishing OD by other sections within organizations. What was very clear and voiced by all of the interview participants from public organizations was that the cost of publishing OD would definitely increase.

Furthermore, interview participants commented on the lack of internal policies and standards, which in turn is reflected in the poor quality of the OD published:
Internal IM practices are lacking, we are building data from the outside in. The internal practices of governments were never designed to manage information, which informs the community in this open format. Governments need to re-think their frameworks. (Interview Participant 4)

Private Organizations
A common theme among interview participants from the private sector was the lack of available data and the difficulty in locating data. These participants work with data on a daily basis and frequently need to locate new and interesting datasets for their projects or initiatives.

Participants indicated that in many cases information was simply not available in an open format but is publicly available online from public web sites. As such, individuals and companies frequently resort to scraping data off public web pages. Although scraping raises many concerns, the consensus among participants is that it is a common practice, even within government organizations.
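
To make the scraping practice concrete, the sketch below converts an HTML table into comma-delimited text using only the Python standard library. It is an illustration under stated assumptions, not a description of any participant's actual tooling: the page fragment, the class name and the function name are all hypothetical, and a real scraper would also have to download the page and re-run whenever the content changes. As noted elsewhere in this paper, data obtained this way is not OD, because the right to re-use it is not granted by an OD license.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would first download the page.
SAMPLE_HTML = """
<table>
  <tr><th>Facility</th><th>City</th></tr>
  <tr><td>North Clinic</td><td>Ottawa</td></tr>
  <tr><td>Harbour Clinic</td><td>Halifax</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (such as whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_to_csv(html):
    """Return the table found in the HTML as comma-delimited text."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


csv_text = scrape_to_csv(SAMPLE_HTML)
print(csv_text)
```

Even this small example depends on the exact layout of the page, which illustrates why scraped datasets are fragile and costly to maintain compared with data published directly in an open format.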

In the case of one participant who was scraping public web sites, the information he was collecting was simply not available in any other format. By aggregating the scraped data, he was able to create a valuable and appealing dataset, since this information was not available anywhere else. By offering an interface to access this data, he created value by allowing regular citizens to search and query the data without analytical skills. Unfortunately, existing license agreements do not cover the scraping of public websites; the web content is publicly available but re-use of the data is technically not permitted. Scraped data is simply not OD. Interestingly, the added value of the information that he created by aggregating this data is so significant that members of the GC are subscribed users of his service:

While publishing public information on web sites governments should make the effort to publish data in an open format also. Any data available publicly should also be available in an open format also. (Interview Participant 2)

Another issue brought forward by interview participants was common standards. Standards provide a common method for aggregating OD that can protect privacy. This was especially relevant for geographic standards. The aggregation of OD is dependent on standards across datasets and if separate departments utilize different standards it will take a considerable amount of effort to amalgamate data. As an example, geographical frameworks used within OD vary between GC departments and this lack of a common framework poses challenges to analysts who need to aggregate data across departments. Existing geographies used in datasets can include postal code zones, federal electoral districts, census geographies or health districts. For example, health and environmental issues can span many jurisdictions and geographic boundaries.
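The effort involved in amalgamating datasets that use different geographies can be illustrated with a minimal sketch. It assumes a crosswalk table that maps one geography onto another; the area codes, district names and values below are hypothetical and exist only for illustration, not as real GC data.

```python
from collections import defaultdict

# Hypothetical crosswalk: postal zone -> health district (illustrative only).
CROSSWALK = {"ZONE-1": "District A", "ZONE-2": "District A", "ZONE-3": "District B"}

# Hypothetical dataset published by postal zone.
clinic_visits = [("ZONE-1", 120), ("ZONE-2", 80), ("ZONE-3", 200), ("ZONE-9", 5)]

# Hypothetical dataset published by health district.
air_quality_index = {"District A": 42, "District B": 55}


def aggregate_to_district(rows, crosswalk):
    """Sum values per district; collect zones the crosswalk cannot place."""
    totals = defaultdict(int)
    unmatched = []
    for zone, value in rows:
        district = crosswalk.get(zone)
        if district is None:
            unmatched.append(zone)  # needs manual review by an analyst
            continue
        totals[district] += value
    return dict(totals), unmatched


visits_by_district, unmatched = aggregate_to_district(clinic_visits, CROSSWALK)

# Only once both datasets share a geography can they be combined.
combined = {
    district: {"visits": visits, "air_quality_index": air_quality_index.get(district)}
    for district, visits in visits_by_district.items()
}
print(combined)
print("Unmatched zones:", unmatched)
```

Without an agreed standard, every pair of departments would need its own crosswalk, and unmatched areas such as the last zone above would have to be resolved by hand.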

Focus Group Session
The focus group session was held during a conference event and included three participants from the private sector. The goal of the session was to explore ideas in a divergent manner and identify possible solutions for known issues preventing OD from achieving benefits. The group identified transparency as an issue and related the problem to the organizational culture found in government organizations. Participants shared their experiences with the group; the discussion focused on examples where information was not released in re-usable formats:

When requesting information that was not publicly available online it was provided to me in a compact disk or in a paper format. (Session Participant 2)

Other issues were identified which related to the efforts required to find information. There is still much information that is not yet available in open format, such as information related to grants and contributions. Participants that work on specific research initiatives indicated a need for a large amount of statistical data to identify trends. According to the group, a considerable amount of effort is used to try to determine if the information actually exists and is available.

Research Findings – Literature Review


The following is a brief summary of the literature review for this paper. In the context of the research question, the review touches on elements of effective Information Management (IM), data management and the value of data, organizational culture and the benefits of transparency from sharing Public Sector Information (PSI).

Information Management
The principles of IM include a series of steps for managing information through a predefined lifecycle. These steps include planning, capturing, organizing, disseminating, preserving, disposing of and evaluating information. The Government of Canada (GC) has a well-defined IM framework which is part of a suite of policies and directives. The IM lifecycle is used to implement and improve IM initiatives and best practices within each GC department and agency. Proper management of information through its lifecycle allows for effective access to information, making it available for decision making and re-use. The following diagram outlines the IM lifecycle.

IM LifeCycle

With the emergence of OD, the GC has established new operating standards for their OD portal, influenced strongly by the dissemination step of the IM lifecycle. These standards include completeness of data, primary source data, timeliness, ease of access and machine-readable formats, non-discrimination, use of common standards, available licensing agreements, freedom of use, and permanence. These principles allow government organizations to share information in ways that were not initially intended, such as dissemination to the public. To effectively share OD and create a new type of relationship with the public, GC departments and agencies need to consider these principles when managing information.

With new guiding principles and outcomes for data use, the dissemination of OD requires a new cycle for information management. The Center for Technology in Government at the University at Albany in New York has taken a heuristic approach to presenting the flow of data sources related to Open Government Data. It includes an iterative process that comprises stakeholders with specific roles and both primary and secondary sources of data. The Center defines OD published on government portals as primary data sources, managed by primary data resources. Non-government data sources such as geo-coded data and third-party data are considered secondary data sources managed by secondary data resources. The combination and consolidation of these sources is where the potential for innovation and economic growth lies; this potential is currently largely untapped. Aggregating primary and secondary data sources brings in external stakeholders and in turn creates added value to data.

Value of Data
Information and data are sometimes used interchangeably, but there is a critical difference. Data is considered raw and factual and it has little significance beyond itself. Information is the value we extract from data, which then becomes knowledge and wisdom. For example, data can be represented digitally as tabular rows and columns, which is a structured format that can take many forms. Information, on the other hand, is derived from the meaning found in the data; this requires human input and consideration, and brings forward knowledge and wisdom. Furthermore, on the scale where data, information, knowledge and wisdom relate to each other, value is gained from the knowledge and wisdom that is initially taken from the meaning of data. This paper discusses the benefits of data, and more specifically openly accessible data, that provide information, knowledge and wisdom. The relationship between these concepts is exemplified in the following diagram.


Public Sector Information (PSI)
Public Sector Information (PSI) includes all of the information collected by governments and can be “otherwise known as Open Government Data.” PSI can encompass several domains of information including: business or administrative, geographic, legal, meteorological and transportation, plus social or statistical data. Due to its breadth and scope, PSI has the potential for a number of economic benefits and advantages.

A 2006 study measuring the PSI market size of the European Union plus Norway (EU25) estimated that it was worth EUR 27 billion. This equates to approximately 0.25% of their total aggregated GDP. A subsequent review of the 2006 study was conducted in 2011 on the same basis with similar outcomes. This new study for the 27 members of the European Union (EU27) demonstrated a rapid growth of approximately 7%, which equated to a PSI market size of EUR 28 billion for 2008 and EUR 32 billion for 2010. PSI can be used in a wide variety of applications to develop innovative goods and services. The aggregation of PSI with secondary data adds “further economic and social benefit to the EU27 economy.” In addition, this study also suggests that removing underlying barriers preventing access to data could lead to gains of 10-40% in the geospatial sector alone. Furthermore, if citizens can save two hours per year with more rapid and comprehensive access to public information it would be worth EUR 1.4 billion per year. The European Commission believes that PSI holds a significant amount of potential and that the raw material from PSI can drive innovation and economic activities if open to the public sector.
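The growth figures quoted above can be checked with a short calculation. The sketch below assumes that the "approximately 7%" is an annual rate compounded over the two years from 2008 to 2010; the text does not state this explicitly, so it is an interpretation rather than a figure taken from the study.

```python
def compound(value, annual_rate, years):
    """Project a value forward at a constant annual growth rate."""
    return value * (1 + annual_rate) ** years


psi_2008_bn = 28.0  # EUR billion, as reported for 2008
psi_2010_bn = 32.0  # EUR billion, as reported for 2010

# Forward check: 7% a year for two years, starting from the 2008 figure.
projected_2010_bn = compound(psi_2008_bn, 0.07, 2)

# Reverse check: the annual rate implied by the two reported figures.
implied_rate = (psi_2010_bn / psi_2008_bn) ** (1 / 2) - 1

print(f"Projected 2010 market size: EUR {projected_2010_bn:.1f} billion")
print(f"Implied annual growth rate: {implied_rate:.1%}")
```

Under that assumption the two reported market sizes are consistent with each other: 7% annual growth from EUR 28 billion gives roughly EUR 32 billion two years later.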

The benefits of PSI are not limited to primary data. Additional savings could be attained if public organizations were to leverage secondary data creatively with PSI. In the case of the US healthcare system, effective use of external and internal data could create efficiencies of more than $300 billion each year, and the added value of combining data has the potential to reduce healthcare expenditures by approximately 8%.

The value chain for the re-use of PSI consists of capturing, organizing, packaging and disseminating information. For the purpose of this paper, only the costs and efforts of packaging and dissemination will be analysed, since the costs and efforts needed to capture and organize are already incurred within public organizations’ existing IM functions. The value of PSI is derived from the use of the data by all stakeholders. This includes the ability for citizens to provide and contribute their “expertise and perspective to government decision making.”

Transparency, participation and collaboration are needed by governments in order to allow citizens to perform various roles. The sheer volume of data or number of datasets published by a government is not a good indicator of value. Rather, the quality and accessibility of data determines whether value has been created. Data with the most value will be centered on specific stakeholders and their interests as opposed to citizens in general. Understanding who is being served and ensuring openness of data are both critical to creating value.

Adaptable Data
OD can take on many forms, but to be open it must be in a machine-readable format and available through the Internet with licensing agreements. This makes it adaptable and easier to analyze, aggregate and process, which in turn provides greater service delivery to citizens. On the other end of the spectrum, inert data includes printed reports, forms and machine-readable data that are not available through the Internet. This form of data prevents ease of use. If data cannot be analyzed and aggregated dynamically with technology then most of its value is lost. To benefit from the dissemination of OD, governments need to embrace technology and minimize inert data. Ease of access and assessment of adaptable data provides transparency and allows for public scrutiny and accountability.

The act of releasing OD demonstrates basic government transparency, but the real benefit of accountability is only obtained with an added “degree of interaction.” This interaction requires that data reaches its intended audience and that mechanisms are in place to allow citizens to react and governments to respond accordingly. This two-directional flow is required in order to reap the full benefits of accountability as opposed to simply releasing data. This interaction can be easily achieved with today’s technology and the release of OD.

Culture of Openness
The dissemination of information needs to be vetted through a series of specific criteria to protect the integrity of the organization. To mitigate this risk, many organizations have implemented a complex assessment process that is not well understood by internal staff because publishing OD is not yet a common practice. Several organizations will only release information that was requested through an access to information request, placing the onus of data transfer on external stakeholders. A government’s ability to solve problems, meet challenges and be innovative is dependent on its ability to flow information to stakeholders. Knowledge is derived from information and to be effective it needs to reach the right person at the right time.

Governments will need to change internal procedures and adapt to methods of disseminating information openly. In 2008, one of the top 10 disruptive technologies was open source software; today it is OD and surrounding technologies. Public scrutiny and the demand from advocacy and special interest groups are triggering the disruptive nature surrounding OD. In addition, OD’s dependency on technology and external factors is likely to cause discontinuous or episodic changes within governments. If not managed properly these could create a culture of resistance to change.

Research description


The purpose of this research project is to examine the benefits and the challenges of publishing Open Data for government organizations. It is presumed that open and accessible data offers multiple benefits, including improved openness and accountability, as well as an increase in innovation and economic growth. This paper aims to help public organizations make sound and informed decisions for extending their Open Data initiatives by determining the social, economic and environmental benefits of publishing Open Data, thereby creating a more cost-effective, transparent, efficient and responsive government.




The analysis has demonstrated how OD can provide social, economic and environmental benefits to society. Several challenges surrounding the dissemination of OD are still preventing these benefits from being achieved. This section provides recommendations for obtaining further benefits with the dissemination of OD.
The five recommendations outlined in this section stem from the analyses, with the objective of achieving the most benefits to society.

Availability of Data
Stronger action is needed from senior officials to prevent issues with compliance; otherwise the availability of datasets will suffer from economic cutbacks. GC departments and agencies need to understand that they are obligated to publish datasets, and OD practices need to be part of their IM practices. Legislation surrounding OD, along with guidelines and tools, needs to be in place to help departments and agencies manage PSI and publish OD.

Recommendation #1: The GC needs to launch the Directive on Open Government to help departments and agencies publish more datasets. As part of the requirement of the directive, there must be standard tools and guidelines such as criteria for publishing that will help departments and agencies identify and publish OD.

In addition, the directive should not allow departments and agencies to pick and choose the data that will be published. Instead, the directive must mandate them to publish all data that meets the criteria for publishing openly. This must also include data from access to information requests and public web sites.

Similar to the Directive on Recordkeeping, the new Directive on Open Government must also contain deadlines and consequences for not meeting requirements. To mitigate the issue of compliance, requirements of the directive must be key objectives within the performance appraisals for all Information Management Senior Officers (IMSO) and Chief Information Officers (CIO) in each department and agency.

Furthermore, a communication campaign promoting the importance of the directive should be led by the highest level of senior officials within the GC. Departments and agencies need to understand the urgency and benefits that can be obtained from publishing OD. With effective communication and a proper mandate, departments and agencies will have the ability to achieve compliance and make more data openly available.

Recommendation #2: The GC should set a concrete goal to convert inert data available on their websites into a dynamic open format within the next year. This includes data that is available on public websites from all departments and agencies. If information is publicly available on government websites then it must also be available openly. Data that is only available on physical disks or in a printed format must be converted.

A large number of OD users are scraping public websites without appropriate licenses. This is an obvious indication that information published on public websites is both needed by OD users and currently unavailable in open format. With a growing number of users wanting to use OD in subsequent projects and the benefits that can be achieved from making more data available, this is a cost-effective approach for the GC to publish data that users are requesting. In addition, this would help support the academic community, which is the biggest user of OD, and would improve transparency with advocacy and special interest groups which lobby governments for more data to be published.

There are several initiatives from other jurisdictions that could be leveraged to help implement standards, guidelines, and lessons learned from other government departments.

Recommendation #3: The GC needs to identify guidelines and standards for the publishing of OD. A lack of consistency of common formats across departments was identified as an issue for combining datasets which was caused by the lack of defined standards and procedures. This lack of standards only increases costs and time delays for groups needing to aggregate datasets. Additionally, since members of the academic community are the biggest users of macro data available from the GC, the lack of standards is obstructing future innovation from research and development within Canada.

The GC needs to work in collaboration with other jurisdictions to establish standards for metadata and geographic information. Furthermore, tools and procedures are required to help departments and agencies manage OD. Existing work within other departments like Statistics Canada and Natural Resources Canada could be leveraged for the process of identifying potential standards for publishing geospatial datasets. In addition, the GC should participate in international initiatives for the development of standards and best practices. The European Commission is working on an initiative for a common standard, as is the Open Data Institute in the UK. Providing common standards will help users aggregate data and gain added benefits from OD.

Dynamic Data
The GC needs to provide mechanisms that will allow information to flow to all stakeholders. External stakeholders should be seen as partners in the effort to publish OD.

Recommendation #4: The GC needs to collaborate with external stakeholders and all departments and agencies. TBS needs to be the catalyst for establishing a degree of interaction for the continuous flow of information, providing the ability to change the patterns of interaction among existing and new stakeholders. This interaction requires that data reaches its intended audience and that mechanisms are in place to allow users to contribute. This two-directional flow will allow transparency to deliver the benefit of public accountability.

The first step of the circular process could be to leverage external stakeholders to validate the need for new datasets. A series of stakeholders that are subject matter experts could be identified to provide feedback. This would give valuable insight into the value of the datasets and enhance the quality of OD published by the GC at little or no cost.

Furthermore, an advisory board needs to be created to allow stakeholders to collaborate on the publication of OD. External parties should be seen as partners in the effort to publish OD; the skills and efforts for in-depth analysis that they offer could help governments reduce internal costs for identifying and publishing OD. External participation will also help create a sense of urgency for OD and remove complacency. In addition, this level of collaboration with external stakeholders would allow for the interaction needed for public accountability. Information will reach targeted audiences, provide a mechanism to react, and allow the GC to respond.

Recommendation #5: Changes within GC departments and agencies need to include a lean operational process for publishing OD. An interdisciplinary team within each organization will need to be identified and implemented which will use a rapid and iterative approach to publishing OD. The team will work with internal divisions to identify efficiencies and develop processes for publishing needed data in days and weeks instead of months and years. It will be important for organizations to identify key individuals with the experience and knowledge to propel lean operational changes. In addition, the team will help implement internal procedures and best practices that will meet legislative obligations for publishing OD.




The Internet has changed how organizations operate and people live. One of the most important changes that online connectivity has introduced is the advent of Open Data (OD), a new technological trend that is transforming access to information. Every day, countless people are able to find new information that is open, accessible and re-useable in ways that pre-digital systems simply could not facilitate.

Open Data
The Open Data Institute defines OD as information that can be used by anyone for any purpose and at no cost. OD is information that is available electronically and in a machine-readable format such as Extensible Markup Language (XML), Comma-Separated Values (CSV) and dataset. In most cases, OD is made available through the Internet and is free to be used and re-used without any copyright restriction. This is made possible through the use of license agreements that allow individuals to openly use and re-use data. For effective re-use and whenever possible, OD should be time stamped, available and accessible in an open format using non-proprietary or open source software, and accompanied by useful metadata and geospatial information, which provides innovative and interactive opportunities to aggregate data with maps.
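To illustrate what "machine-readable" means in practice, the following minimal sketch parses a small CSV file with Python's standard library and totals one of its columns. The dataset, its column names and the accompanying metadata fields are hypothetical and are shown only to make the idea concrete.

```python
import csv
import io

# Hypothetical CSV content; a real file would be downloaded from a portal.
RAW_CSV = """region,year,permits_issued
North,2012,14
South,2012,30
North,2013,18
"""

# Hypothetical metadata of the kind that should accompany a dataset.
METADATA = {
    "title": "Illustrative permits dataset",
    "last_updated": "2013-06-01",  # time stamp
    "licence": "illustrative open licence",
}


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(RAW_CSV)

# Because the structure is explicit, software can aggregate it directly.
total_by_region = {}
for row in rows:
    region = row["region"]
    total_by_region[region] = total_by_region.get(region, 0) + int(row["permits_issued"])

print(METADATA["title"], "updated", METADATA["last_updated"])
print(total_by_region)
```

The same figures printed in a report or embedded in a web page would need to be re-keyed or scraped before any such aggregation could take place.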

As part of the efforts for driving innovation and economic opportunities, the Government of Canada (GC) launched its online Open Data Portal in March 2011 to centralize freely available data. In April 2012 the GC joined the International Open Government Partnership (OGP) and endorsed the core principles of the multilateral initiative for Open Government Data (OGD). Today the Treasury Board of Canada Secretariat (TBS) is responsible for the governance, including guidelines and policies, applicable to data, and since its launch the portal has increased its federal department participation and available datasets. The GC is expanding its Open Government initiatives along three main streams: Open Information for the release of information on government activities, Open Data for making information available in a machine-readable format, and Open Dialogue, which gives citizens the opportunity to engage in dialogue with their government about policies and priorities. In addition, TBS is working on the release of a new and common Open Government Licence for OD and developing a new Directive on Open Government.

Today’s technology is providing users with newfound flexibility to access more information than ever before. Through efforts such as those undertaken by the GC, it is evident that the proliferation of content available on the Internet is empowering citizens and is changing governments. Open Government is part of an effort to use these technologies to make government more open and accessible.

Challenges surrounding the dissemination and release of OD can sometimes prevent or limit its benefits. Publishing OD may involve a number of labour- and time-intensive tasks, such as changing data formats, making sure that information is up-to-date, aligning datasets with existing licenses and meeting criteria for releasing information that could be sensitive. Despite these possible obstacles, the process of publishing data is critical for governments because it is the first step to engaging users and demonstrating transparency. In addition, it can foster internal changes to organizations, such as the implementation of new standards and technologies, and/or changes to organizational and cultural behaviors. It also begins the interactive process needed to validate and achieve a level of quality data.
The success of OD is dependent on the quality of the information. In order to assure that OD is of high quality, related information must be made available. Metadata about the datasets can provide users with information about the data and allow for ease of use by improving public understanding and incorporating other information, such as geographical information.

Project Purpose
This paper will focus on examining the benefits and challenges of publishing OD for government organizations. It will attempt to identify advantages of OD published from the GC Open Data Portal. It is presumed that open and accessible data offers multiple benefits, including improved openness and accountability, as well as an increase in innovation and economic growth. This paper aims to help public organizations make sound and informed decisions for extending their OD initiatives by determining the social, economic and environmental benefits of shared data from public organizations, thereby creating a more cost-effective, transparent, efficient and responsive government.

Glossary of Terms

Aggregate
to gather two or more datasets together to form one; sometimes referred to as "combining" or "mashing up"

API
Application Programming Interface – a middle tier component used to communicate with the data from a user interface or front end application

Big data
a dataset too large to process with traditional on-hand database management tools

Data
quantitative values represented in a structure used to create information (e.g., datasets)

Dataset
an identifiable collection of data

Comma-delimited file
collection of data in a plain text file, presented in tabular form where fields are separated by commas

Crowdsource
to solicit contributions from a large group in order to obtain needed services, ideas, or content

the ability to access and use

Open Data
structured electronic information in a machine-readable format that is accessible and available for use or re-use without any copyright restriction

Open Source
free software with access to source code developed in a public, collaborative manner

Scraping
removing or extracting data from web pages



Externalities refer to a scenario in which unexpected outcomes occur that impact the marginal costs or benefits of a specific effort. In the case of OD, the effort of publishing data results in marginal benefits. Positive externalities are created when further benefits are obtained from the aggregation of data, a series of complex reactions is ignited, and data is shared across multiple jurisdictions with other stakeholders. Marginal benefits combined with positive externalities result in marginal social benefits. The challenge lies in attempting to measure the amount of positive externalities.
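The relationship described in this paragraph can be written compactly. The symbols are introduced here only for illustration and are not part of the original analysis: $MB$ denotes the marginal benefits of publishing OD, $E^{+}$ the positive externalities, and $MSB$ the resulting marginal social benefits.

```latex
\[
  MSB = MB + E^{+}
\]
```

Because $E^{+}$ is the term that is difficult to measure, any estimate of $MSB$ inherits that uncertainty.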

An evaluation of qualitative data is required to measure the outcome and performance of publishing OD. We have already identified the marginal benefits as:

  1. Public service delivery
  2. Support of industries and commerce
  3. Re-using of information internally
  4. Support for academic research
  5. Support for evidence-based policy decisions
  6. Public accountability

We can also determine some marginal costs from publishing OD, which include time spent analysing and publishing data. Some citizen groups and media outlets believe that the release of government information could be misinterpreted and that the time spent on unwanted data can be a waste of government resources. This perspective is influenced by the inherent difficulty of determining the value of data prior to its publication and aggregation.

In order to demonstrate benefits of publishing OD, positive externalities must outweigh any negative externalities. There are few negative externalities from publishing OD that could create marginal social costs. In addition, the action of not publishing OD is not a negative externality derived from the effort of publishing OD. That is, when governments do not publish OD, benefits are prevented, but the marginal social benefits from the effort of publishing OD are not reduced. Because of this we can determine that there will always be more positive externalities than negative ones, as shown in the following diagram.


Nevertheless, based on the survey sample, not publishing OD would have an impact on the most common users, the academic community. They would lose the most from limited access to data. These users possess the skills and knowledge for analysing and combining data; not having access to OD would hinder research and development, which in turn would negatively impact future innovation, the release of new products and services, and potential economic benefits. In the 1980s, Canada witnessed such a negative impact when Statistics Canada charged the public a fee for data. At the time, the prevailing opinion was that the cost of producing powerful data had to be recovered, and that this was best accomplished by charging the user. Canadian research questions went unanswered as some researchers were forced to turn to US or European data, while other researchers gave up using these types of data altogether, “causing a decade of lost capacity in quantitative expertise.”

