Internals of ‘Shortcuts’ feature in Microsoft Fabric

Microsoft Fabric is a powerful and flexible data platform that enables organizations to process, analyze, and gain insights from their data at scale. One of the key features that sets Fabric apart is the shortcut, which lets Fabric workloads perform in-place I/O on files without physical data movement or duplication. This article covers the basics of Microsoft Fabric shortcuts: how they work, and the performance optimization strategy the platform applies when data is accessed through them.

Shortcuts: The Key to In-Place IO

Shortcuts act as virtual pointers to data stored in other locations, both within Fabric and externally. Fabric workloads read and write that data in place through the shortcut, so nothing has to be physically moved or copied into OneLake. This approach offers several advantages:

  • A single virtual data lake: shortcuts bring together data from Azure, Amazon Web Services (AWS), and other sources under one unified namespace for the entire enterprise.
  • Central management of access: OneLake manages the permissions and credentials, so you do not have to configure each Fabric workload separately to connect to each data source.
  • Cost-effective analysis: shortcuts themselves do not consume compute resources; they rely on the I/O performance of the remote storage where the data resides. When data is read through a shortcut, the I/O is handled by the external storage (e.g., ADLS Gen2, S3, GCS), and the compute units (CUs) consumed depend on the operation performed within the Fabric engine. CU cost is therefore incurred only for the specific analysis being conducted.
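To make the unified namespace concrete, here is a minimal sketch of how a path behind a shortcut is addressed. It assumes OneLake's ADLS-compatible `abfss://` URI layout; the workspace, lakehouse, and shortcut names (`SalesWorkspace`, `SalesLakehouse`, `s3_orders`) are hypothetical and used only for illustration.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, relative_path: str) -> str:
    """Build an abfss:// URI for a path inside a Fabric item.

    Assumes the layout abfss://<workspace>@<host>/<item>.<type>/<path>.
    """
    clean_path = relative_path.strip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{clean_path}"


# Hypothetical example: 'raw' is an ordinary folder stored in OneLake,
# 's3_orders' is a shortcut that points at an external bucket.
native = onelake_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Files/raw/orders")
via_shortcut = onelake_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Files/s3_orders/2024")

print(native)
print(via_shortcut)
```

Both URIs have the same shape. A consumer addresses the shortcut like any other folder in the lakehouse and does not need to know where the bytes physically live.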


Data duplication

Compared to physically storing data on OneLake, using shortcuts can be more cost-effective. Storing data on OneLake involves CUs for compute and storage transactions, as well as raw storage costs. Shortcuts instead use the storage infrastructure you already have, so no additional storage cost is incurred within Fabric. They also let you eliminate edge copies of data and the process latency that comes with copying and staging: instead of moving data to a central location, you leave it where it is and access it through shortcuts.

Performance

Because shortcuts perform in-place read/write operations, no copy or staging step adds latency before the data can be used. Reads through a shortcut can still differ slightly in performance from reads of data stored directly in OneLake, and some customers are willing to accept that difference for the ease of use and reduced pipeline costs that shortcuts offer. Region also matters: when the files behind a shortcut are in the same region as the Fabric capacity, latency is lower than for reads across regions, because the network path between the Fabric capacity and the storage is shorter.
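Whether the performance difference matters for a given workload is easiest to settle by measuring it. The sketch below is a generic timing helper, not a Fabric API: `median_read_latency` and `fake_read` are hypothetical names, and `fake_read` is a stand-in that you would replace with a real read of the same file through a shortcut and from native OneLake storage.

```python
import statistics
import time
from typing import Callable


def median_read_latency(read_once: Callable[[], object], runs: int = 5) -> float:
    """Return the median wall-clock time, in seconds, of `runs` calls to `read_once`."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        read_once()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_read() -> bytes:
    """Stand-in for a real read; sleeps briefly to simulate I/O."""
    time.sleep(0.01)
    return b"data"


latency = median_read_latency(fake_read, runs=3)
print(f"median latency: {latency * 1000:.1f} ms")
```

The helper reports the median rather than the mean, so a single slow first read (for example, a cold cache) does not dominate the result.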


Microsoft Fabric employs various performance improvement strategies, such as Intelligent Cache, to reduce latency and optimize data access. These strategies ensure that Fabric provides a seamless and efficient data analysis experience, even when working with large datasets or remote data sources.

Shortcut Types and Use Cases

Shortcuts in Fabric can point to data stored in various locations, including:

  • Internal Fabric sources: Lakehouses, Warehouses, KQL databases
  • External sources: Azure Data Lake Storage (ADLS), Amazon S3, and more

You can create shortcuts interactively using the Fabric UI or programmatically using the REST API. In a KQL database, data exposed through a shortcut is queried with the external_table() function.
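As an illustration of the programmatic route, the sketch below builds (but does not send) a request to create an ADLS Gen2 shortcut. The endpoint path and the field names in the body (`path`, `name`, `target`, `adlsGen2`, `location`, `subpath`, `connectionId`) reflect my reading of the Fabric REST API and should be checked against the current reference. All IDs, the token, and the storage account are placeholders, and `build_create_shortcut_request` is a hypothetical helper name.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_create_shortcut_request(
    workspace_id: str,
    item_id: str,
    token: str,
    name: str,
    path: str,
    target: dict,
) -> urllib.request.Request:
    """Build a POST request that would create a shortcut inside a Fabric item."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{item_id}/shortcuts"
    body = {"name": name, "path": path, "target": target}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with real IDs and a valid access token.
request = build_create_shortcut_request(
    workspace_id="<workspace-id>",
    item_id="<lakehouse-id>",
    token="<access-token>",
    name="adls_orders",
    path="Files",
    target={
        "adlsGen2": {
            "location": "https://<account>.dfs.core.windows.net",
            "subpath": "/<container>/orders",
            "connectionId": "<connection-id>",
        }
    },
)

print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect and log before anything is created in the workspace.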

Limitations and Considerations

While shortcuts offer many benefits, it's important to consider the following limitations:

  • The maximum number of shortcuts per Fabric item is 100,000
  • Shortcuts in the Tables section can only point to Delta tables, while shortcuts in the Files section can point to any supported format
  • Shortcuts in KQL databases can't be renamed, and only one shortcut can be created at a time
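The first two limits above can be checked before a shortcut is created. The following is a minimal sketch of such a pre-flight check. `ShortcutPlan` and `validate_plan` are hypothetical names, and whether the target is a Delta table is passed in by the caller as an assumption rather than detected.

```python
from dataclasses import dataclass

MAX_SHORTCUTS_PER_ITEM = 100_000  # limit stated in the list above


@dataclass
class ShortcutPlan:
    name: str
    section: str           # "Tables" or "Files"
    target_is_delta: bool  # supplied by the caller, not detected here


def validate_plan(plan: ShortcutPlan, existing_count: int) -> list[str]:
    """Return a list of problems; an empty list means the plan passes these checks."""
    problems = []
    if plan.section not in ("Tables", "Files"):
        problems.append(f"unknown section: {plan.section!r}")
    if existing_count + 1 > MAX_SHORTCUTS_PER_ITEM:
        problems.append("item would exceed the maximum number of shortcuts")
    if plan.section == "Tables" and not plan.target_is_delta:
        problems.append("shortcuts in the Tables section must point to Delta tables")
    return problems


# A non-Delta target in the Tables section is rejected.
print(validate_plan(ShortcutPlan("orders_csv", "Tables", target_is_delta=False), existing_count=10))
# The same target is acceptable in the Files section.
print(validate_plan(ShortcutPlan("orders_csv", "Files", target_is_delta=False), existing_count=10))
```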

Conclusion

Microsoft Fabric's shortcuts are a game-changer in the world of data processing and analytics. By enabling in-place IO without the need for physical data movement, shortcuts offer a cost-effective and efficient way to access and process data. Combined with Fabric's performance optimization strategies, such as Intelligent Cache, they let you work with data where it already lives while keeping resource consumption and costs down. As you embark on your data journey with Microsoft Fabric, keep these insights in mind and use shortcuts to streamline your workflows.

