SQL Server 2012 - Unstructured Data Storage (File Table)
In simple words, unstructured data is information which does not fit well in relational model. It is mostly text heavy, but can contain images, videos, emails etc.
File Table, a new feature introduced in SQL 2012, builds on the existing filestream object SQL Server had come up with in 2008 to store unstructured data. Filestream datatype tried to integrate the relational database engine with the windows file system. More information about the filestream object can be found here.
File Table enables non-transactional access to file data. This means the file table data can be accessed through SQL in a transactional way, as well as by the Windows APIs as if it was accessing a file object. File table basically converts SQL tables into folders which can be accessed through Windows explorer. The directory structure and the file attributes are stored into File Table as columns. Files can be bulk loaded, updated as well as managed in T-SQL like any other column. SQL Server also supports backup and restore job for this.

As you can see above, file attributes are stored as columns. stream_id uniquely identifies the file. Data is contained in the filestream column. Most of the columns are read only. The only parameters specified to create an entry in file table is the file name and file directory name.
CREATE TABLE Documentstore AS FileTable
WITH (
FileTable_Directory = 'Documentstore',
FileTable_Collate_Filename = database_default
);
GO
Loading files in File Table is as simple as drag and drop in windows explorer.

There are inbuilt SQL triggers which intercept Windows API and maintain data consistency in File Table in case of insertion/updating/deletion.
select * from Documentstore where name ='FileTable.xml'
![]()
Working with Directories and files
Here are set of useful queries for working with File Table data.
• To know if a particular row belongs to a file or directory:
o select * from documentstore where is_directory=1
• To get the root level path of the file table:
o select FileTableRootPath('DocumentStore')
• To get the relative path of particular file or directory in a file table:
o select file_stream.GetFileNamespacePath() from documentstore
• To get the specific locator ID of a file by providing the path:
o select GetPathLocator('[fullpath]')
• To understand the full path to a file or directory stored in a FileTable:
o DECLARE @rootpath nvarchar(100);
DECLARE @fullpath nvarchar(1000);
SELECT @rootpath = FileTableRootPath();
SELECT top 1 @fullpath = @rootpath + file_stream.GetFileNamespacePath() FROM documentstore
PRINT @fullpath;
GO
The resulting hierarchy looks like this:
\\<machine>\<instance-level-filestream-share>\<database-level-directory>\<File-Table-directory>\
The directory hierarchy which is created at the time of file table creation is a virtual directory hierarchy. This value is stored in SQL DB with datatype=hierarchyid.
Troubleshooting
File handles which are not cleaned up can prevent exclusive access required for certain administrative tasks. To identify open files and associated locks, run the below query:
SELECT opened_file_name
FROM sys.dm_filestream_non_transacted_handles
WHERE fcb_id IN
( SELECT request_owner_id FROM sys.dm_tran_locks )
You can also use SQL Profiler to capture Windows File Open and Close operations in Trace Output for files stored in file table.
Once you have identified open file handles in the database or file table, you can clean them up by running the builtin stored proc:
sp_kill_filestream_non_transacted_handles [[ @table_name = ] 'table_name',[[ @handle_id = ] @handle_id]]
In our subsequent blogs, we will see how can we exploit this file table feature of SQL 2012 to do in-depth analysis of unstructured data and gain useful insights.


Once blank application is created, add following references in the project from the Extensions tab.
After adding references right click on package.appmanifestfile and select view code. In opened XML file add Extensions tag after capabilities tag. This is shown below,
Before running the application we need to make sure that target framework has been changed to X64 in the Configuration Manager. This is because smooth streaming is not supported on ARM.
Master Data Manager also provides the ability to validate data by defining business rules inside the application.
Users also have the ability to bulk upload data through T-SQL scripts by running some of the system SPs provided.
An excel add-in is also introduced to provide ability to bulk upload master data through excel sheets.
As we had already talked in starting, Master Data Management is the need of the hour. Every business, in one way or another, will need a centralized database of master data which it can use to integrate volumes of data coming from different sources, and make it presentable so that its view remains consistent across the organization. Microsoft on its part has introduced a powerful set of MDM components to adhere to these needs. Combined with the set of Integrating and Reporting tools provided by Microsoft, the master data can be put to great use to integrate and analyze the information coming from different systems and make informed business decisions.



