We are pleased to introduce a series of resources dedicated to Simplifying your GIS, starting with Data Management. This series builds upon some basic principles that are often overlooked or underutilized when striving to create a successful GIS at an organization. We have broken these down into three main categories that include:
- Data Management
- Sharing Data
- Apps and Useful Tools
Is your organization just getting started with GIS and storing shapefiles in folders? Or maybe your organization’s GIS capabilities are mature but disorganized, without any uniformity or consistency. This can be a result of employee turnover or of working with multiple consultants over the years, each with different standards and ideas of how a GIS should function. Your data structure could be keeping your organization from reaching its GIS initiatives and goals. However, we’re here to help with GIS data management!
This week we are going to focus on the foundation of your GIS, and there is one common practice that can benefit your organization’s GIS more than any other: GIS Data Management. This is often the less glamorous part of GIS and can be a touchy conversation for many organizations. However, we’d like to take the time to remind you that this is the foundation of your GIS and without it…well, we already know why you’re reading this. But let’s keep it simple!
The most important element of an effective GIS is the data itself and how it’s stored within the database. Sounds silly, I know, but data management is often the most underappreciated part of an efficient GIS. Without management, a GIS fills up with incomplete, flawed, and outdated data. Let’s discuss how data standardization can help your organization reach its goals. Adopting a standardized database schema not only gives you a well-documented geodatabase that can be passed on to future GIS managers, it also connects you to the large community of users working with that same schema. This is valuable because tools, apps, and scripts may already exist for the task you want to complete.
Working from a common data model gives you and your organization a sound platform for data management, collaboration, data analytics, sharing of maps and data, using pre-built templates or tools (such as Collector for ArcGIS), and creating common export practices. If you are new to GIS or starting from a broken structure, then data standardization can help!
Data standardization promotes working within the bounds of a reliable and complete template for how your data is stored. With standardization it’s important to understand the quality requirements of the domains. Let’s take a “fire hydrant” feature for example. If we want to collect a fire hydrant’s location and include information such as its color as an attribute, we’d want to define some limits on that attribute. Doing so restricts the field to a fixed list of available options, streamlining the choices to something like: Red, Blue, Green, Other, etc. This is creating a domain.
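To make the idea concrete, here is a minimal sketch in plain Python of how a domain restricts a field to a fixed list of values. It does not use any Esri API; the names `HYDRANT_COLOR_DOMAIN` and `validate_color` are hypothetical and exist only for illustration.

```python
# Hypothetical coded-value domain for a hydrant "color" attribute.
# The values mirror the example list in the text.
HYDRANT_COLOR_DOMAIN = {"Red", "Blue", "Green", "Other"}


def validate_color(value, domain=HYDRANT_COLOR_DOMAIN):
    """Return the cleaned value if it is in the domain, else raise ValueError."""
    # Normalize whitespace and capitalization before checking the domain.
    normalized = value.strip().title()
    if normalized not in domain:
        raise ValueError(f"{value!r} is not in the domain: {sorted(domain)}")
    return normalized
```

A field editor entering " red " would have the value stored as "Red", while an entry such as "Purple" would be rejected instead of silently creating a new, inconsistent category.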
Creating domains for a whole city’s GIS can be a tedious task if a domain expert is not available to apply consistency and uniformity. This is where working from an existing data model can be a large benefit! I like to call it getting a jump start with your GIS, because that’s essentially what it is. A database model provides this consistency when getting started. Not all organizations or agencies function the same way or have the same data requirements, but you can work from a model to get started and modify from there.
Every day at Frontier we work extensively with Esri’s Local Government Information Model, Water Utilities Data Model, and Utility and Pipeline Data Model. These are industry-specific models that make setting up or cleaning up your GIS a less daunting task and provide a great foundation for solid data management practices. Want the easy button to find data and have it categorized? Standardization can help.
In the example below, there are 16 parcel layers, and that’s just what fits on the screen. Which one would you use? If this were your GIS, and you were the only one using it and the only one who would ever use it in the future, then no problem. Pick your favorite, and move on. However, it is far more likely that there are multiple people accessing this data. One person might pick the Parcels shapefile, while another might pick the feature class that’s inside a geodatabase labeled 2016. What if they both make edits? Spend too much time going down this path, and you’re really in trouble. Or worse yet, you move on to another job or retire, and leave this confusing mess to someone else who has no idea where to find the current data, and now we’re starting all over.
If this is you, don’t worry, you are definitely not alone. Messy GIS closets are everywhere. But it’s time for some January cleaning. Ditch the clutter and reorganize the good data into a standard, documented schema. The shapefiles, CAD files and feature classes on the left were reorganized into the standardized data model on the right. Some additional pieces had to be added and some unnecessary things removed from the standardized data model, but it is now well-documented and easy for anyone to find the current data.
If you have existing data and are ready to move to a standardized dataset, there are a few variables to consider. One is the quality of the data, which can be hard to assess unless you have reliable metadata. That leads us into our next topic.
What is metadata? The dictionary defines it as “data that provides information about other data”. We like to call it a lost art!
Metadata is the resource that validates the integrity of your data. It is attached to your dataset and should include information about how your GIS layers were created, their coordinate system reference frames, the date they were created, and the overall integrity of your data. This can play an important role if your organization is migrating existing GIS layers into a new standardized model.
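As a rough illustration, the items listed above could be tracked per layer with a small record like the one below. This is only a sketch in plain Python, not a metadata standard or an Esri format; the class `LayerMetadata`, its field names, and the helper `missing_metadata` are assumptions made for the example.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class LayerMetadata:
    """Hypothetical per-layer metadata record (field names are illustrative)."""
    layer_name: str
    creation_method: Optional[str] = None    # how the layer was created
    coordinate_system: Optional[str] = None  # coordinate system reference frame
    created_on: Optional[date] = None        # date the layer was created
    integrity_notes: Optional[str] = None    # known accuracy or quality caveats


def missing_metadata(record):
    """List the names of metadata fields that are still empty."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in (None, "")]
```

Running `missing_metadata` over every layer before a migration gives a quick list of the gaps that need to be filled in by the people who know the data.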
Benefits of metadata include:
- A way to help determine the quality of your data
- An understanding of how the data was created
- Assistance with budgeting – knowing whether data needs to be updated or purchased again
- Documentation for legal issues
- A means of communication to others about your data
- Intelligibility of relationships between data sources (Are fields referenced the same between different datasets?)
These benefits may not be apparent right away. Another way to look at metadata is as a form of insurance for your data. Turnover at an organization is a good example: metadata is an information resource that lets the next person come in and understand the data.
Let’s take a look at some specific examples and benefits of storing accurate and logical metadata within your GIS.
The pain of data migration is worth it, I promise… but where to begin? You might choose to migrate all existing data into the new geodatabase at once. However, the process of data migration can be cumbersome depending on the dataset and the quality of the existing data. You may realize a more immediate return on investment by first defining the desired product, determining the required data, and then migrating that data into the database model. You should prioritize your information products based on current needs and on the time and cost to implement them. The steps below outline a standard data migration.
Map the Existing Data
Prior to data migration, match existing data with the layers in the new geodatabase model. Mapping data is a time-consuming process that requires input from those who know the data best, and a full Migration Matrix should be built and approved by all involved with the data.
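One simple way to picture a Migration Matrix is as a lookup from each existing layer and field to its counterpart in the new model. The sketch below is plain Python with made-up layer and field names (`Hydrants.shp`, `Hydrant`, and so on are hypothetical, not taken from any Esri data model); it only shows how unmapped fields can be flagged for review.

```python
# Hypothetical source schema: existing layer name -> list of its fields.
SOURCE_SCHEMA = {
    "Hydrants.shp": ["COLOR", "INSTALLED", "NOTES"],
}

# Hypothetical migration matrix:
# (source layer, source field) -> (target layer, target field)
MIGRATION_MATRIX = {
    ("Hydrants.shp", "COLOR"): ("Hydrant", "Color"),
    ("Hydrants.shp", "INSTALLED"): ("Hydrant", "InstallDate"),
}


def unmapped_fields(source_schema, matrix):
    """Return (layer, field) pairs that have no target in the matrix yet."""
    return [(layer, field)
            for layer, layer_fields in source_schema.items()
            for field in layer_fields
            if (layer, field) not in matrix]
```

Here `unmapped_fields(SOURCE_SCHEMA, MIGRATION_MATRIX)` would flag the `NOTES` field, which is the kind of decision (map it, add a field to the model, or drop it) that the people who know the data should sign off on.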
Make Modifications to the Standardized Data Model
Once the data and fields have been mapped, decisions can be made regarding changes to the new standardized geodatabase model and the existing data. Thorough documentation of all modifications must be made. After a standardized data model has been implemented, it is beneficial to update to a later revision when released. Esri offers tools to update to these revised models, but changes you’ve made to your existing data model will not carry over. This is where the documentation of changes will be needed.
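Part of that documentation can be generated rather than written by hand. As a sketch under simple assumptions (each schema is represented as a dictionary of layer name to field names, and only added or removed fields are tracked), the hypothetical helper `diff_schema` below compares the published model with your customized copy.

```python
def diff_schema(baseline, customized):
    """Compare two {layer: [field, ...]} schemas and list field-level changes.

    `baseline` stands for the model as published and `customized` for the
    locally modified copy. Both structures are illustrative assumptions.
    """
    changes = {"added": [], "removed": []}
    for layer in sorted(set(baseline) | set(customized)):
        base_fields = set(baseline.get(layer, []))
        custom_fields = set(customized.get(layer, []))
        changes["added"] += [(layer, f) for f in sorted(custom_fields - base_fields)]
        changes["removed"] += [(layer, f) for f in sorted(base_fields - custom_fields)]
    return changes
```

Keeping the output of a comparison like this alongside your written notes makes it easier to reapply local changes after moving to a later revision of the model.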
Extract, Transform and Load
ETL (Extract, Transform and Load) describes a database process where data is read from a source (extract), converted from its original form (transform), and written to a target database (load). There are multiple choices for loading data. The three most common are the Simple Data Loader, the Object Loader, and the Data Interoperability Extension for Desktop.
- Simple Data Loader – Use the Simple Data Loader to load data into simple feature classes, meaning feature classes without relationships, attachments or geometric networks.
- Object Loader – Use the Object Loader to load data into more complex feature classes. It is most commonly used to load new data into feature classes participating in a geometric network.
- Data Interoperability Extension – This is a powerful extension to ArcGIS Desktop based on Safe’s FME software. This tool is extremely valuable when moving data from a non-GIS format to a geodatabase or if extensive data transformation is required.
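The tools above do this work inside ArcGIS, but the three ETL stages themselves can be illustrated with a small, generic sketch. The example below uses only Python's standard library, with a CSV string standing in for the source and an in-memory SQLite table standing in for the target database. The table name `hydrant`, the column layout, and the sample rows are all invented for illustration and are not part of any Esri tool.

```python
import csv
import io
import sqlite3

# Made-up source data standing in for a non-GIS format.
SOURCE_CSV = """id,color,x,y
1, red ,100.5,200.25
2,BLUE,101.0,201.0
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean up values and convert them to the target types."""
    return [(int(r["id"]), r["color"].strip().title(), float(r["x"]), float(r["y"]))
            for r in rows]


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS hydrant "
                 "(id INTEGER PRIMARY KEY, color TEXT, x REAL, y REAL)")
    conn.executemany("INSERT INTO hydrant VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run all three stages and return the number of rows in the target."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*) FROM hydrant").fetchone()[0]
```

Calling `run_etl(SOURCE_CSV, sqlite3.connect(":memory:"))` loads both sample rows, with the inconsistent color values cleaned up to match a domain-style list during the transform step.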
We feel data management is often an overlooked and underutilized practice when implementing a GIS. If you are just starting out or have questions, please don’t hesitate to contact Frontier – we can help! We understand the process and are here to help you grow your GIS.
Our next topic will include resources for Sharing data and information with Esri ArcGIS solutions.
by Jacob Wittenberg and Alison Walker