21.08.2019

File Management Structures

Written for program managers who have not adopted file management practices for their data analysis projects

To create a file structure on the PC, you will need to use Windows Explorer, which allows you to browse the folders on your hard drive and on the server. To open Windows Explorer, right-click (click the right-hand button on the mouse) the Start button in the lower left-hand corner of the task bar.

One thing you'll want to consider as you create a file management system is the treatment of camera original files and the derivative files made from them. While it seems natural to keep these together in one folder structure, there are a number of advantages to separating them into two different directory structures.

Most program managers manage and analyze data as part of their scope of work, but many have not been trained in working with data. In this post I offer the three simplest file management practices - naming conventions, directory structure, and rules for moving files - that keep me organized and working efficiently in data analysis projects.

Naming conventions. Choosing a name for a file or folder, or set of files or folders, should be intentional. One option is to group files or folders in clusters. Ask yourself, do my files or folders have any common themes? If so, use the theme as the first word of the folder, which will cause the folders to cluster.

Another option is to use the alphabet or numerical order to your advantage. The default on my computer is to alphabetize directory structures by name of the file or folder. I use the prefix A, B, C, D, and so on for the folders that I use most frequently. Or I use the prefix 1, 2, 3, 4, and so on.

Many people like to include the date or version number in the file name, which is a fine approach. Consider whether you need a date or version number. The directory listing automatically shows the date and time that the file was last modified, which may or may not be sufficient for your purposes. Note - if you are programming or using statistical computing with data in a folder, then it is a best practice not to use spaces in file or folder names; instead, use underscores or no separator at all, and use lowercase letters.
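The note above about spaces and lowercase letters can be applied automatically. The following is a minimal Python sketch, not part of the original post; the function name clean_name and the choice to drop punctuation are my own assumptions, so adjust them to your own convention.

```python
import re
from pathlib import Path


def clean_name(filename: str) -> str:
    """Return a lowercase, underscore-separated version of a file name."""
    path = Path(filename)
    # Work on the stem only so the dot before the extension is preserved.
    stem = path.stem.strip().lower()
    # Collapse each run of whitespace or hyphens into a single underscore.
    stem = re.sub(r"[\s\-]+", "_", stem)
    # Drop anything that is not an ASCII lowercase letter, digit, or underscore.
    stem = re.sub(r"[^a-z0-9_]", "", stem)
    return stem + path.suffix.lower()
```

For example, clean_name("Survey Results Final.XLSX") returns survey_results_final.xlsx. The function only computes the new name; renaming the file on disk is left as a separate, deliberate step.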

Directory structure for data projects. Creating an intentional directory structure for managing data analysis projects is one of the most helpful practices I have been taught. I learned this process in a quantitative methods class with Dr. Jacob Fowles at the University of Kansas. For every data analysis project, I always start by creating a standard folder structure. The folder structure is essentially the same for every data analysis project that I execute, which keeps my work organized. In the documentation folder, I record the steps that I execute as I go. I deposit data in the datasets folders and work on analyses in the analysis folders. My final products are in the work folder.

You can use Command Line Tools to automate the process of creating the directory structure. Click on the Windows icon, type cmd in the search bar to open Command Line Tools, and then type the commands provided below. Two commands are used: md = make directory and chdir = change directory. Coursera’s free Data Scientist’s Toolbox lecture 18 covers using Command Line Tools for directory structures if you are interested.

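rem Create the project folder and move into it
md folder1
chdir folder1
rem Create the standard subfolders for documentation, data, analysis, and work
md A_Documentation
md B_Datasets B_Datasets\Data_Raw B_Datasets\Data_Processing B_Datasets\Data_Clean
md C_Analysis C_Analysis\Analysis_Processing C_Analysis\Analysis_Clean
md D_Work D_Work\Drafts D_Work\Posted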
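
If you work in Python rather than at the Windows command line, the same folder tree can be created with a short script. This is a sketch of an alternative to the md commands above, not the method from the original post; the function name create_project is hypothetical, and the folder names are taken from the commands above.

```python
from pathlib import Path

# Folder names mirror the md commands above.
FOLDERS = [
    "A_Documentation",
    "B_Datasets/Data_Raw",
    "B_Datasets/Data_Processing",
    "B_Datasets/Data_Clean",
    "C_Analysis/Analysis_Processing",
    "C_Analysis/Analysis_Clean",
    "D_Work/Drafts",
    "D_Work/Posted",
]


def create_project(root: str) -> list[Path]:
    """Create the standard folder tree under root and return the folders."""
    base = Path(root)
    created = []
    for name in FOLDERS:
        folder = base / name
        # parents=True also creates B_Datasets, C_Analysis, and D_Work;
        # exist_ok=True makes the function safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling create_project("folder1") builds the same structure as the commands above, and because existing folders are left alone, rerunning it on an existing project does not disturb any files.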

Rules for moving files. For every data analysis project, I use a traceable process so that I can easily reproduce any step and quickly find my data at various steps in the analysis process. When I download or receive a new dataset for a project, I always put the dataset in the raw data folder to retain a copy of the original dataset (see directory structure above). Then I create a copy in the processing data folder. There I clean the data: removing duplicates, handling missing values, fixing capitalization, concatenating columns, and so on. When the data is clean, I move the dataset to the clean data folder. This is the final dataset that I use for analysis.

Next I begin analysis by copying the clean dataset into the analysis processing folder. I analyze the data in various ways, and when I am content with my analysis I move it to the clean analysis folder. Last, I create the story for the analysis in the work drafts folder. When the story and analysis are complete, I create a copy in the work posted folder. It is important to give the file a clean and concise name before circulating it to your audience. Too often people circulate files with haphazard names, and unintentional file names appear unprofessional.
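
The copy-then-move rule for datasets can be scripted so that the raw file is never edited by accident. The following Python sketch assumes the folder names from the directory structure above already exist; the function names stage_for_processing and promote_to_clean are hypothetical, and refusing to overwrite an existing file is my own design choice rather than a rule from the post.

```python
import shutil
from pathlib import Path


def stage_for_processing(project: str, filename: str) -> Path:
    """Copy a raw dataset into Data_Processing, leaving the original in place."""
    datasets = Path(project) / "B_Datasets"
    source = datasets / "Data_Raw" / filename
    target = datasets / "Data_Processing" / filename
    if target.exists():
        # Refuse to overwrite cleaning work that is already in progress.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(source, target)  # copy2 also keeps the modified time
    return target


def promote_to_clean(project: str, filename: str) -> Path:
    """Move a cleaned dataset from Data_Processing to Data_Clean."""
    datasets = Path(project) / "B_Datasets"
    source = datasets / "Data_Processing" / filename
    target = datasets / "Data_Clean" / filename
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(source), str(target))
    return target
```

A typical sequence is stage_for_processing("folder1", "survey.csv"), then cleaning the copy, then promote_to_clean("folder1", "survey.csv"). The file name survey.csv is only an illustration. The same pattern could be repeated for the analysis folders.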

I hope that your work can benefit from these tips. In summary, basic steps that you can easily adopt include creating intentional naming conventions, establishing a consistent directory structure for data analysis projects, and organizing the movement of data through the analysis project using rules for moving files.

Data management is a robust applied field. This post only covers a few basic concepts. For more reading, see the following resources:

Some Simple Guidelines for Effective Data Management
Elizabeth T. Borer et al.
Bulletin of the Ecological Society of America, 90(2) 205-214

Best Practices for File Naming
Stanford University
Data Management Services

File Management
Cornell University
Research Data Management Services Group