Data science in Microsoft Fabric using Visual Studio Code

You can build and develop data science and data engineering solutions for Microsoft Fabric within VS Code. Microsoft Fabric extensions for VS Code provide an integrated development experience for working with Fabric artifacts, lakehouses, notebooks, and user data functions.

What is Microsoft Fabric?

Microsoft Fabric is an enterprise-ready, end-to-end analytics platform. It unifies data movement, data processing, ingestion, transformation, real-time event routing, and report building. It supports these capabilities with integrated services like Data Engineering, Data Factory, Data Science, Real-Time Intelligence, Data Warehouse, and Databases. Sign up for free and explore Microsoft Fabric for 60 days — no credit card required.

Prerequisites

Before you get started with Microsoft Fabric extensions for VS Code, you need:

  • A Microsoft Fabric account (or free trial) with access to a Fabric workspace.
  • Visual Studio Code installed on your machine.
  • Python installed locally if you plan to develop user data functions or run code locally.

Installation and setup

You can find and install the extensions from the Visual Studio Marketplace or directly in VS Code. Select the Extensions view (⇧⌘X (Windows, Linux Ctrl+Shift+X)) and search for Microsoft Fabric.

Which extensions to use

| Extension | Best For | Key Features | Recommended for you if… | Documentation |
|---|---|---|---|---|
| Microsoft Fabric extension | General workspace management, item management, and working with item definitions | Manage Fabric items (Lakehouses, Notebooks, Pipelines); Microsoft account sign-in & tenant switching; unified or grouped item views; edit Fabric notebooks with IntelliSense; Command Palette integration (Fabric: commands) | You want a single extension to manage workspaces, notebooks, and items in Fabric directly from VS Code. | What is the Fabric VS Code extension |
| Fabric User data functions | Developers building custom transformations & workflows | Author serverless functions in Fabric; local debugging with breakpoints; manage data source connections; install/manage Python libraries; deploy functions directly to a Fabric workspace | You build automation or data transformation logic and need debugging + deployment from VS Code. | Develop user data functions in VS Code |
| Fabric Data Engineering | Data engineers working with large-scale data & Spark | Explore Lakehouses (tables, raw files); develop/debug Spark notebooks; build/test Spark job definitions; sync notebooks between local VS Code & Fabric; preview schemas & sample data | You work with Spark, Lakehouses, or large-scale data pipelines and want to explore, develop, and debug locally. | Develop Fabric notebooks in VS Code |

Getting started

Once you have the extensions installed and signed in, you can start working with Fabric workspaces and items. In the Command Palette (⇧⌘P (Windows, Linux Ctrl+Shift+P)), type Fabric to list the commands that are specific to Microsoft Fabric.

Fabric Workspace and items explorer

The Fabric extensions provide a seamless way to work with both remote and local Fabric items.

  • In the Fabric extension, the Fabric Workspaces section lists all items from your remote workspace, organized by type (Lakehouses, Notebooks, Pipelines, and more).
  • In the Fabric extension, the Local folder section shows the Fabric item folders that are opened in VS Code. It mirrors the item definition structure for each item type, so you can develop locally and publish your changes to the current workspace or a new one.

Use user data functions for data science

  1. In the Command Palette (⇧⌘P (Windows, Linux Ctrl+Shift+P)), type Fabric: Create Item.

  2. Select your workspace and select User data function. Provide a name and select Python as the language.

  3. You are prompted to set up the Python virtual environment; continue to set it up locally.

  4. Install the libraries using pip install, or select the user data function item in the Fabric extension to add libraries. Update the requirements.txt file to specify the dependencies (a sample requirements.txt sketch follows these steps), then add your function code:

    import datetime
    import fabric.functions as fn
    import logging

    # Import additional libraries
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score
    import joblib

    udf = fn.UserDataFunctions()

    @udf.function()
    def train_churn_model(data: list, targetColumn: str) -> dict:
        '''
        Description: Train a Random Forest model to predict customer churn using pandas and scikit-learn.

        Args:
        - data (list): List of dictionaries containing customer features and churn target
          Example: [{"Age": 25, "Income": 50000, "Churn": 0}, {"Age": 45, "Income": 75000, "Churn": 1}]
        - targetColumn (str): Name of the target column for churn prediction
          Example: "Churn"

        Returns:
        dict: Model training results including accuracy and feature information
        '''
        # Convert data to DataFrame
        df = pd.DataFrame(data)

        # Prepare features and target
        numeric_features = df.select_dtypes(include=['number']).columns.tolist()
        numeric_features.remove(targetColumn)
        X = df[numeric_features]
        y = df[targetColumn]

        # Split and scale data
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train)
        X_test_scaled = scaler.transform(X_test)

        # Train model
        model = RandomForestClassifier(n_estimators=100, random_state=42)
        model.fit(X_train_scaled, y_train)

        # Evaluate and save
        accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
        joblib.dump(model, 'churn_model.pkl')
        joblib.dump(scaler, 'scaler.pkl')

        return {
            'accuracy': float(accuracy),
            'features': numeric_features,
            'message': f'Model trained with {len(X_train)} samples and {accuracy:.2%} accuracy'
        }

    @udf.function()
    def predict_churn(customer_data: list) -> list:
        '''
        Description: Predict customer churn using trained Random Forest model.

        Args:
        - customer_data (list): List of dictionaries containing customer features for prediction
          Example: [{"Age": 30, "Income": 60000}, {"Age": 55, "Income": 80000}]

        Returns:
        list: Customer data with churn predictions and probability scores
        '''
        # Load saved model and scaler
        model = joblib.load('churn_model.pkl')
        scaler = joblib.load('scaler.pkl')

        # Convert to DataFrame and scale features
        df = pd.DataFrame(customer_data)
        X_scaled = scaler.transform(df)

        # Make predictions
        predictions = model.predict(X_scaled)
        probabilities = model.predict_proba(X_scaled)[:, 1]

        # Add predictions to original data
        results = customer_data.copy()
        for i, (pred, prob) in enumerate(zip(predictions, probabilities)):
            results[i]['churn_prediction'] = int(pred)
            results[i]['churn_probability'] = float(prob)

        return results
  5. Test your functions locally by pressing F5.

  6. In the Fabric extension, under Local folder, select the function and publish it to your workspace.
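
For step 4, a minimal requirements.txt sketch for the churn example above might look like the following. The entries are inferred from the imports in the sample code and are an illustration, not the exact file the item template generates; keep any entries the generated file already contains, such as the Fabric functions runtime package.

    # Illustrative requirements.txt for the churn example; pin versions as needed
    pandas
    scikit-learn
    joblib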

Learn more about invoking the function in the Microsoft Fabric user data functions documentation.

Use Fabric notebooks for data science

A Fabric notebook is an interactive workbook in Microsoft Fabric for writing and running code, visualizations, and markdown side by side. Notebooks support multiple languages (Python, Spark, SQL, Scala, and more) and are ideal for data exploration, transformation, and model development in Fabric, working with your existing data in OneLake.

Example

The cell below reads a CSV with Spark, converts it to pandas, and trains a logistic regression model with scikit-learn. Replace the column names and file path with values from your own dataset.
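
A minimal sketch of such a cell is shown below. The file path (Files/churn.csv) and the column names (tenure, MonthlyCharges, and Churn) are placeholder assumptions, and the cell relies on the built-in spark session that Fabric notebooks provide.

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Read a CSV from the lakehouse with Spark; replace the path with your own file
    spark_df = spark.read.csv("Files/churn.csv", header=True, inferSchema=True)

    # Convert the Spark DataFrame to pandas for scikit-learn
    df = spark_df.toPandas()

    # Placeholder feature and target columns; replace with columns from your dataset
    X = df[["tenure", "MonthlyCharges"]]
    y = df["Churn"]

    # Hold out a test set, then train a logistic regression model
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)

    print(f"Accuracy: {accuracy_score(y_test, model.predict(X_test)):.2%}")

The target column is assumed to hold 0/1 labels; if your churn column is stored as text, encode it before fitting the model.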