# Data Migration using Conversational Interfaces
Traditional data migration to the cloud often follows a lift-and-shift approach that requires significant technical expertise. While domain experts possess a deep understanding of the business value of data, they often lack access to the tools or technical expertise required to migrate the specific datasets needed to generate insights.
This project presents a pattern that empowers domain experts to perform selective data migration through conversational interfaces by leveraging Model Context Protocol (MCP) servers and Microsoft Fabric. This approach transforms data migration from a technical barrier into an intuitive, business-user-friendly process that maintains data governance while accelerating time-to-insight.
## Features
- **Conversational Data Discovery**: Use natural language to identify and explore data artifacts
- **Seamless Cloud Migration**: Automated migration process that preserves data integrity
- **Data Consolidation**: Combine multiple data sources into unified datasets
- **Interactive Querying**: Query and analyse migrated data through a conversational interface
## Architecture
The system architecture demonstrates the end-to-end flow of data migration and conversational interaction:

## Process
### Prerequisites
- Access to JDBC-compatible source database(s)
- Access to a Microsoft Fabric workspace to create a Lakehouse & Data Agent
### Set-up
1. **Set up the Source Database**: A JDBC-compatible database like Oracle or PostgreSQL can be used as a source. A script for creating a table with some sample data in an Oracle database has been provided in [scripts/create_employee_table.sql](scripts/create_employee_table.sql).
2. **Set up the Quarkus MCP Server for JDBC**: Follow the setup instructions at [Quarkus MCP JDBC Server](https://github.com/quarkiverse/quarkus-mcp-servers/tree/main/jdbc#claude-desktop-config) to configure the JDBC MCP server for database connectivity. If you're using VS Code with GitHub Copilot as the MCP Client, set up the configuration in VS Code.
3. **Download and Configure OneLake File Explorer**: Download OneLake File Explorer from [Microsoft Download Center](https://www.microsoft.com/en-us/download/details.aspx?id=105222) and configure it to connect to your Microsoft Fabric workspace. This will enable seamless data movement between your local environment and the Microsoft Fabric Lakehouse.
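Before wiring up the MCP server, it can help to see the shape of the data involved. The sketch below uses Python's built-in `sqlite3` module as a lightweight stand-in for the JDBC source; the table and column names here are illustrative assumptions, not the actual schema from `scripts/create_employee_table.sql`.

```python
import sqlite3

# In-memory SQLite database standing in for the JDBC source (Oracle/PostgreSQL).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical employee table; the real schema lives in scripts/create_employee_table.sql.
cur.execute("""
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL,
        salary     REAL NOT NULL
    )
""")
cur.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Engineering", 95000.0),
        (2, "Ben", "Sales", 70000.0),
    ],
)

# Selective extraction: everything except the sensitive salary column,
# mirroring the privacy-preserving migration described in the Usage section.
rows = cur.execute("SELECT emp_id, name, department FROM employee").fetchall()
print(rows)  # salary never leaves the source database
```

The same idea carries over to the conversational flow: the MCP client issues an equivalent projection against the real source, so sensitive columns are excluded at extraction time rather than filtered after migration.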
### Usage
1. **Data Discovery**: Use your MCP Client (e.g. VS Code with GitHub Copilot or Claude Desktop) to explore the data source through conversational queries. Start by asking questions like "What tables are available in the database?". If you've created the employee table using the provided script, the table structure will look like this:

2. **Extract Data**: Notice that the employee table contains a `salary` column which we'll exclude from the migration process for privacy/compliance reasons. Use the prompt provided in [prompts/extract_data.txt](prompts/extract_data.txt) to extract all employee information while excluding the salary details. Note that the data can be extracted as a file placed locally in the Fabric Lakehouse's Files folder which will automatically sync it with Microsoft Fabric. A sample CSV file containing this data is available in [files/empdata.csv](files/empdata.csv). An example of this process using VS Code with GitHub Copilot (Claude Sonnet 4) is shown below:

3. **Combine Datasets**: Once the information is available in the Fabric Lakehouse, combine it with other data artifacts to generate deeper insights. If you're using the employee dataset, a file containing department locations is available in [files/deptloc.csv](files/deptloc.csv) which can be used to link employees to locations. In the Fabric Lakehouse, the tables would look like this:

4. **Build Conversational Interface**: To expose the consolidated dataset as a conversational interface which can be queried by domain experts in natural language, create a Data Agent in Microsoft Fabric using the instructions provided in [Create Fabric Data Agent](https://learn.microsoft.com/en-us/fabric/data-science/how-to-create-data-agent). If you're using the datasets mentioned above, you can use the sample instructions mentioned in [Data Agent Instructions](prompts/data_agent.txt). A screenshot of this is shown here:

5. **Query in Natural Language**: Domain experts can now query the consolidated dataset using natural language. In the example below, the Agent is asked a question that requires information from two tables to be combined. The image contains the output from the Agent, along with the SQL query it generated behind the scenes to join the tables.

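Behind the scenes, the Data Agent answers cross-table questions by generating a join over the consolidated Lakehouse tables. The sketch below reproduces that idea with Python's built-in `sqlite3`; the table and column names (`employee`, `dept_location`) are assumptions based on the sample CSVs, not the actual Lakehouse schema or the Agent's generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-ins for the two Lakehouse tables loaded from
# files/empdata.csv and files/deptloc.csv (illustrative rows only).
cur.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, department TEXT)")
cur.execute("CREATE TABLE dept_location (department TEXT, location TEXT)")
cur.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                [(1, "Asha", "Engineering"), (2, "Ben", "Sales")])
cur.executemany("INSERT INTO dept_location VALUES (?, ?)",
                [("Engineering", "London"), ("Sales", "Dublin")])

# The kind of join the Agent might generate for
# a question like "Where is each employee based?"
rows = cur.execute("""
    SELECT e.name, d.location
    FROM employee e
    JOIN dept_location d ON e.department = d.department
    ORDER BY e.name
""").fetchall()
print(rows)  # → [('Asha', 'London'), ('Ben', 'Dublin')]
```

The value of the Data Agent is that domain experts never see this SQL: they ask the question in natural language, and the join is generated and executed for them.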
## Conclusion
This pattern demonstrates how conversational interfaces can transform the traditional data migration process, making it accessible to domain experts without requiring deep technical expertise. By combining the power of MCP servers with Microsoft Fabric's cloud-native data platform, organisations can:
- **Democratise Data Migration**: Enable business users to identify, extract, and migrate relevant data assets through natural language interactions
- **Maintain Data Governance**: Implement selective migration strategies that exclude sensitive information while preserving analytical value
- **Accelerate Time-to-Insight**: Reduce the complexity and time required to consolidate data from multiple sources in the cloud
- **Enable Self-Service Analytics**: Provide domain experts with conversational interfaces to query and analyse consolidated datasets without SQL knowledge