1. Introduction
Report generation is an integral part of enterprise application development. Choosing the right kind of report generation strategy decides the success of the application in terms of cost and performance. The whitepaper discusses some of the key factors that need to be considered when architecting reporting systems.
Picking the right kind of reporting tools also plays a significant role in deciding the scalability and cost of the system. We will discuss some of the reporting tools which are widely used in the existing reporting systems. Selecting the best out of these tools is a crucial decision. SQL reporting service is one of the hot reporting tools; the white paper gives an introduction to the tool.
Comparing reporting tools in terms of performance and cost is a really difficult job; and these measures will vary from architecture to architecture. But comparing the .NET features available in these tools is very interesting for a .NET chap. The white paper attempts to perform a comparative study of the common tools available in the market.
Excel reporting is an integral part of many financial applications. Many retail applications require Excel reports with fairly large amount of data and good format. The simple approach which comes into our mind will be to uses Excel component. But Microsoft doesn't recommend this approach. One of the best alternatives is explained in the Excel Reporting section.
Securing the data exposed by reports is also a critical decision. The white paper discusses some of the common strategies used in large enterprise systems to secure the data exposed by reports.
2. Reporting Tools
Let's discuss some of the reporting tools which are widely used in the existing reporting systems. Selecting the best out of these tools is a crucial decision. Three main reporting tools available in the .NET code space are:-
a) Crystal Reports
Crystal Reports (for Visual Studio .NET) is the standard reporting tool for Visual Studio .NET. It brings the ability to create interactive, presentation-quality content; to the .NET platform. Crystal report is an integral part of the Visual Studio .NET environment.
Using Crystal reports, developers can host reports on web and windows platforms and publish Crystal reports as report web services on a web server. Crystal Reports for Visual Studio .NET is based on the framework of Crystal Reports 8.0, but has been developed for the .NET platform to provide .NET developers with a richest API.
Crystal Reports for Visual Studio .NET provides developers with data visualization and analysis capabilities through seamless integration within all the Visual Studio languages targeting .NET. It uses an open and flexible architecture, with standards like XML, to share reports and information over the Web. Report presentation in both Windows and HTML are available.
Using Visual Studio .NET it is possible to create a new Crystal report or add an existing Crystal report to a project. Reports can be kept on a local machine or published as a web service on a web server. Depending on whether it is a windows or web application, it is possible to bind the report with either the Windows Forms Viewer or the Web Forms Viewer. Users can run the Windows application on a Windows platform or deploy the web application on a client browser to view the Crystal report.
Crystal Reports for Visual Studio .NET works in a single-tier, two-tier, and three-tier architecture on both Windows Forms and Web Forms platforms.
Single-Tier Applications
A Windows application that contains a local report is based on a single-tier approach where all the components are installed on each user's machine. Report components, including the report file (.rpt file) and the Report Engine, are installed with the executable file on the user's machine.
Scalability
Windows applications with local reports are best suited for small-scale deployment since the application is installed on each user's machine. This type of deployment relies heavily on client-side processing. If any changes to the report are required, the application needs to be reinstalled on the user's machine.
Two-Tier Applications
In the case of a web application that hosts a local report (.rpt file); the processing is based on a two-tier relationship where any events raised on the client are handled on the Web server.
Scalability
Users require only a browser on their machines. All report components reside on the server and can be updated centrally.
Three-Tier Applications
A Report Web Service is a Crystal report that has been published, or exposed, for consumption by an application. Over the Internet, a Windows or Web application can connect to a Report Web Service, consume the exposed Crystal report, and display the report for users by hosting it in a report viewer. It is supported with a Report Web Service that is part of the architecture, or can access a Report Web Service that is part of architecture.
Scalability
Depending on the type of application, users require only a Windows Forms Viewer or a Web browser to view Report Web Services. All other report components reside with the Report Web Service on a remote server. Any changes made to the Report Web Service are reflected immediately in the application.
Enterprise Applications
If a distributing application needs to support report viewing and delivery across a large company, it requires more than one report engine to handle all users' report queries. Crystal Enterprise provides organizations with a scalable, secure Web-based management and delivery system for Crystal .rpt files to any authorized user, inside the organization or beyond.
Using Visual Studio .NET, it is possible to build applications which can access Crystal Enterprise servers. This enables designers to build a small client which can be distributed to thousands of users and use the Crystal Enterprise servers to handle report management, processing, and scheduling.
Scalability
Crystal Enterprise allows organizations to scale up (add more processors) or scale out (add more physical machines) to meet growing user demands. This system delivers web speed performance for both on-demand and scheduled reports. Crystal Enterprise is designed to scale to the enterprise and deliver rich, interactive reports to a broad range of users.
Licensing
Now, let's have a look at the licensing details related with Crystal reports. Crystal Reports have mainly four editions; Crystal reports integrated with .NET, Developer, Advanced and Professional. Major features of Crystal Developer/ Advanced reports are mobile device support, additional export formats, and improved flexibility while connecting and retrieving data from data bases.
'Crystal reports for .NET' is mainly suited for designing and testing server or web based applications. It is also suited for thick client windows applications i.e. applications in which reporting engine is installed and run from the end user's client machine. It is not suited for a Web farm environment. To redistribute a server or web application that uses the .NET reporting component to third parties (outside the organization), requires a written authorization from Crystal. This procedure does not associate any extra cost.
b) Crystal Enterprise Reports
Crystal Enterprise is a web-based enterprise report management and distribution system. Crystal Enterprise can add value to the existing .NET reporting functionality since it provides a secure, scalable, multi-tier, web-based, reporting solution. With this framework, users throughout the organization, business partners, and customers can access critical information via the web. Some of the main features of Crystal Enterprise are described below.
- Security
Unlike Crystal Reports, Crystal Enterprise incorporates a security system for managing content delivery to both internal and external users. It offers both user level and object level security. It also offers the ability to log all actions on the Crystal Enterprise server for security and auditing purposes.
- Scheduling
Crystal Enterprise includes a scheduling system that enables report processing at specific times or on recurring schedules. Scheduling is done on a clusterable server component to ensure that crucial processing schedules are not lost or ignored in the event of hardware failure. Scheduling can increase system performance by having the reports ready to view before they are needed.
- Load Balancing
In a multi-server environment, balancing the load efficiently across multiple machines greatly enhances scalability and end user response time. Load balancing also ensures effective use of hardware and decreases performance bottlenecks. Crystal Enterprise includes built-in load balancing across all system management and report processing functions.
- Fail-Over
A key component of the Enterprise architecture is built-in support for fault-tolerance. This involves the ability to cluster key components of the system to ensure maximum uptime and responsiveness, and the ability to fail-over between machines in a cluster in the event of hardware failure.
- Caching
This Web optimization approach increases the responsiveness of Crystal Enterprise. Caching stores previously accessed report pages in a rapid access file system so that they can be dispatched to other users. This is faster then processing an entire report again.
- Integrating Crystal Enterprise with .NET applications
Crystal Enterprise can be incorporated into .NET application using SOAP or COM. Depending on the method used for calling the Crystal Enterprise, different functionality can be retrieved.
Calling Crystal Enterprise through SOAP allows us to get a listing of reports and to view the reports with either the Web Forms Viewers or the Windows Forms Viewers controls.
Crystal Enterprise also exposes a full COM-based object model to enable a more complete method of integration. This COM-based SDK offers a single entry point to the Crystal Enterprise Web services. This object model encapsulates all the calls needed to extract report listings, control the processing and delivery of reports, view reports, and generally interact with the Crystal Enterprise services.
c) Active Reports
Active Reports for .NET from Data Dynamics can be used along with the Visual Studio editor to write code in either C# or Visual Basic.NET to generate reports. Active Reports can be viewed in web browsers. Active Report component includes an ASP.NET server control to setup web client viewer options as well as a Windows Forms viewer control that offers split and multi-page views, text-searches, table of contents and a customizable toolbar. Since Active Reports is not the default Visual Studio item, it is wise to have a look at the key features provided by it.
Features
- ASP.NET Support
Active Reports can be used in web applications by using the ASP.NET Server Control. This control supports web report distribution using HTML, ActiveX, .NET and PDF viewers. The Web Viewer control allows quick viewing of Active Reports on the web as well as printing capability with the ActiveX viewer and Acrobat Reader viewer Type properties. RPX files HTTP Handler allows developers to simply place a report on the web server and make it available to the Internet users in HTML or PDF without writing any code.
- Integration with Visual Studio.NET
Microsoft Visual Studio.NET editor can be used to write C# or VB.NET code to customize reports. It is possible to build reports dynamically. Report Creation API provides complete runtime access to report objects. Report Wizards are available to generate simple and quick reports. The end user designer can be used to host the designer in development application and provide end user report editing capabilities. The designer component can be used for customizing the display.
- Deployment
Easy to deploy Reporting Engine is a single managed strong named assembly. Assemblies can be distributed using XCopy or placed in the global assembly cache (GAC). Professional Edition includes - HTTP handlers and server side web control for ASP.NET reporting
- Licensing
Active Reports come in two flavors; Professional and Standard edition. Choosing the right approach depends up on the cost and application needs. End User Report Designer, ASP.NET integration, HTTP handlers and Web Viewer control are the major features of Active Reports Professional edition compared to Standard edition. See the comparison section for more details.
d) SQL Reporting Service
SQL Server Reporting Services is a comprehensive server-based solution that enables the creation, management, and delivery of both traditional, paper-oriented reports and interactive web-based reports. An integrated part of the Microsoft business intelligence framework, Reporting Services combines the data management capabilities of SQL Server and Microsoft Windows Server with familiar and powerful Microsoft Office System applications to deliver real-time information to support daily operations and drive decisions. SQL Server reporting services is strongly integrated with SQL Server. But it doesn't mean that report data source should be SQL Server. It can be Oracle DB2, Informix, and Sybase etc.
It comes in two flavors; Standard and Enterprise editions. Enterprise edition offers more features such as web farm configurations, security extension APIs, supports more than four CPU, supports more than 2GB RAM etc.
Strong Integration with XML
The entire report and the data source are stored in an XML file. This allows more flexibility for the developers in generating dynamic reports. The report design is strongly integrated with Visual Studio .NET environment. Report developers can create reports to be published to the Report Server using Microsoft or third-party design tools that use Report Definition Language (RDL), an XML-based industry standard used to define reports.
Security
Supports Windows based authentication and role based authentication. Better security model compared to any other reporting tool.
Output Formats
SQL Reporting Services support Excel, TIFF, HTML, PDF, CSV and XML formats. It also supports Office Web Components and Web Archives.
Report Management
Report definitions, folders, and resources are published and managed as a Web service. Managed reports can be executed either on demand or on a specified schedule, and are cached for consistency and performance. SQL Server Reporting Services supports both on-demand (pull) and event-based (push) delivery of reports. Users can view reports in a Web-based format or in e-mail. Scheduling of reports, web farm support, report history are some other major features
It is also characterized with a well defined set of APIs which can be used for customizing report management, securing and displaying reports. Subscription and delivery support, Visual Studio.NET integrated report designer, featured and powerful report processing etc are some of the other features.
e) Comparative Study
It is nice to have a comparative study of the features of the reporting tools, we have discussed so far. Since Crystal Enterprise focus is more towards enterprise reporting, here we are comparing Active Reports standard, Active Reports Professional, Crystal Reports, and Crystal Reports Developer edition.
The following table gives a rough idea regarding the advantages and disadvantages of various reporting tools over the other. The comparison doesn't look into the reporting features; instead concentrate more on the .NET support.
3. Effective Excel Reporting
Enterprise applications often depend on Excel reports for interacting with other ERP systems such as JD Edwards. Projects which fall under the financial and retail verticals use Excel spreadsheets heavily. Excel automation in a client side environment usually makes use the Excel object. But this section focuses mainly on how to take advantage of Excel in ASP.NET environment.
Implementing Excel automation in the server side is not that difficult with ASP.NET. But making it scalable and deployable is not an easy job. Using Excel object in an ASP.NET environment creates many concerns in the long run. In fact, Microsoft also doesn't support or recommend server side automation of Office.
Problems Using Automation of Office Server-Side
Designers need to be aware of six major concerns associated with Excel report generation using Office automation. A designer should be aware of how to address these concerns and how to minimize their effects as much as possible. Consider these items carefully while automating Excel or Office because no one solution can address all of them, and different designs require us to prioritize the elements differently.
- User Identity: Office Applications assume a user identity when they are run, even when they are started by Automation. They attempt to initialize toolbars, menus, options, printers, and some add-ins based on settings in the user registry hive for the user who launches the application. ASP.NET runs under 'ASPNET' user account, which has got limited access privileges, and therefore Office may fail to initialize properly on startup. Even if the Office application can be started, usually it doesn't happen, other functions may fail to work properly. So we have to perform impersonation either through code or through web configuration file. Implementing impersonation using code also dangles security concerns.
- Interactivity with the Desktop: Office Applications assume that they are being run under an interactive desktop, and may in some circumstances need to be made visible for certain automation functions to work properly. If an unexpected error occurs, or an unspecified parameter is needed to complete a function, Office is designed to prompt the user with a modal dialog box that asks the user what they want to do. A modal dialog box on a non-interactive desktop cannot be dismissed, which causes that thread to hang. Although certain coding practices can help reduce the likelihood of this occurring, they cannot prevent it entirely.
- Reentrancy and Scalability: Server-side components need to be highly reentrant, multi-threaded COM components with minimum overhead and high throughput for multiple clients. Office Applications are in almost all respects the exact opposite. They are non-reentrant, STA-based Automation servers that are designed to provide diverse but resource-intensive functionality for a single client. They offer little scalability as a server-side solution, and have fixed limits to important elements, such as memory, which cannot be changed through configuration. More importantly, they use global resources (such as memory mapped files, and shared Automation servers), which can limit the number of instances that can run concurrently and lead to race conditions if they are configured in a multi-client environment. Developers who plan to run more then one instance of any Office Application at the same time need to consider "pooling" or serializing access to the Office Application to avoid potential deadlocks or data corruption.
- Resiliency and Stability: Office 2000, Office XP, and Office 2003 use Microsoft Windows Installer (MSI) technology to make installation and self-repair easier for an end user. MSI introduces the concept of "install on first use", which allows features to be dynamically installed or configured at runtime (for the system, or more often for a particular user). In a server-side environment this both slows down performance and increases the likelihood that a dialog box may appear that asks for the user to approve the install or provide an appropriate install disk. Although it is designed to increase the resiliency of Office as an end-user product, Office's implementation of MSI capabilities is counterproductive in a server-side environment. Furthermore, the stability of Office in general cannot be assured when runs in the server side because it has not been designed or tested for this type of use. Using Office as a service component on a network server may reduce the stability of that machine and as a consequence the network as a whole. If the plan is to automate Office server-side, attempt to isolate the program to a dedicated computer that cannot affect critical functions, and that can be restarted as needed.
- Server-Side Security: Office Applications were never intended for use server-side, and therefore do not take into consideration the security problems that are faced by distributed components. Office does not authenticate incoming requests, and does not protect from unintentionally running macros, or starting another server that might run macros, from the server-side code. Do not open files that are uploaded to the server from an anonymous Web! Based on the security settings that were last set, the server can run macros under an Administrator or System context with full privileges. In addition, Office uses many client-side components (such as Simple MAPI, MSDAIPP) that can cache client authentication information in order to speed up processing. If Office is being automated server-side, one instance may service more than one client, and because authentication information has been cached for that session, it is possible that one client can use the cached credentials of another client, and thereby gain non-granted access permissions by impersonating other users.
- Performance Cost: Using Excel object in the server side ASP.NET code to automate Excel automation also reduces performance throughput compared to alternative approaches. Each excel operation involves a COM call, which involves Runtime Callable Wrapper as the intermediate, significantly affects the speed while generating relatively large reports.
Besides these technical concerns, a designer should also consider the feasibility of such a design with respect to licensing. Current licensing guidelines prevent Office Applications from being used on a server to service client requests, unless those clients themselves have licensed copies of Office. Using server-side Automation to provide Office functionality to unlicensed workstations is not covered by the End User License Agreement ().
Alternative approaches
Microsoft strongly recommends that we, developers, should find alternatives to Automation of Office while developing server-side solutions. Because of the limitations to Office's design, changes to Office configuration are not enough to resolve all issues. Microsoft recommends a number of alternatives that do not require Office to be installed server-side, and that can perform most common tasks more efficiently and quickly than Automation.
The most recommended approach is to generate a XML file or HTML file, which in turn represents the spreadsheet. Because Office 2000 and later support HTML as a native document format, most documents can be created in HTML, using XML markup when needed, and streamed to a client by using a MIME type so that the resulting text is displayed in Office. The document can be edited, saved, and even returned to the server when needed, by using nothing more than ASP.NET on the server. For earlier versions of Office, other easily manipulated text formats can be used.
Some native binary formats can be edited by using the Office Web Components or ActiveX Data Objects (ADO) with much greater speed and scalability. Office Web Components can be used to generate spreadsheets with limited features. But the performance cost associated with these components is less compared to directly using Office Component. Document properties can be viewed or changed without Automation, and file management and versioning is possible by using FrontPage Server Extensions or Distributed Authoring and Versioning. When Automation is essential, most tasks can be offloaded to the client, providing better stability and scalability for the system because each user runs the task in their own context, with their own settings.
Case Study
In our last project we were asked to generate Excel reports for integration with JD Edwards. Initially we followed the traditional approach to generate the Excel reports. These reports contain lot of formulas, comments and formatting so that the users can directly work on these Excel reports.
The following code depicts the approach that we have used to generate the Excel report:-
Excel.Application excelApplication;
Excel._Workbook excelWorkBook;
Excel._Worksheet excelWorkSheet;
Excel.Range excelRange;
// Perform garbage collection to free existing resources.
C.Collect();
// Create the excel object
excelApplication = new Excel.Application();
excelApplication.Visible = false;
// Create a new workbook.
excelWorkBook = (Excel._Workbook)
(excelWorkBook.Workbooks.Add(Type.Missing));
// Get the active sheet
excelWorkSheet = (Excel._Worksheet)excelWorkBook.ActiveSheet;
// Print the details
for(int index=0; index < noOfRecords; index++)
{
// Do some processing
excelWorkSheet.Cells[index, 1] = "RecordValue";
// Code to perform more formatting
}
//AutoFit columns A:Z.
excelRange = excelWorkSheet.get_Range("A1", "Z1");
excelRange.EntireColumn.AutoFit();
excelApplication.Visible = false;
excelApplication.UserControl = false;
excelWorkBook.SaveAs(temporaryFileName,
Excel.XlFileFormat.xlWorkbookNormal, null, null, false, false, Excel.XlSaveAsAccessMode.xlShared,
false, false, null, null, null);
// Need all following code to clean up and free all references!!!
excelWorkBook.Close(null,null,null);
excelApplication.Workbooks.Close();
// Free the resources
excelApplication.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(excelRange);
System.Runtime.InteropServices.Marshal.ReleaseComObject(excelApplicatn)System.Runtime.InteropServices.Marshal.ReleaseComObject(excelWorkShet);
System.Runtime.InteropServices.Marshal.ReleaseComObject(excelWorkBook);
// Set the object values to null
excelWorkSheet = null;
excelWorkBook = null;
excelApplication = null;
// Force Garbage collection
GC.Collect();
The Excel object used was 'Microsoft Excel 11 library'. From the code, it can be understand that most of the formatting involves a RCW call. For printing around 6000 rows, it took 2-3 minutes on a 700 MHz Pentium IV computer.
The performance of the report was according to the Customer expectations. So we started thinking about other possibilities. Luckily, before delivering the project to the client we could find an alternative solution. Thanks to Microsoft!
We got the solution from one of the MSDN links. Instead of generating the Excel reports using Excel object, we generated XML files which is supported by Office XP. Excel file can be saved to XML format or HTML format in Office XP. We have also thought about the option of generating HTML files instead of XML files. Even though HTML is an ideal file format for displaying Excel data in Web browsers, in order to persist Excel-specific features and have Excel workbooks displayed well in a Web browser, Excel HTML is highly complex. This results in files that are only read by Excel. XML helps separate data and view formatting, which results in a cleaner, more robust file format and, more importantly, makes Excel data more available to other applications.
We have decided to (or forced to) experiment with the new approach. All what is happening is good for the World; we found the new approach the most scalable and performance driven approach. We could generate reports which contain fairly large amount of data (more than 10,000 records) with in 30-40 seconds.
Features of XML Approach
Some of the features of the XML implementation in Excel and the Microsoft Office XP Spreadsheet component include:
- "Round-tripping" information through Excel and the Spreadsheet component
- Opening non-Excel, non-Spreadsheet component generated XML data files
- Opening and "flattening" arbitrary, well-formed XML
- Opening hand-authored XML Spreadsheet documents
- Publishing data from Excel to the Spreadsheet component
- Exporting data from the Spreadsheet component to Excel
- Copying and pasting data from Excel to the Spreadsheet component, and vice versa
- Using XML-based QueryTable objects in Excel
- Persisting Range objects as XML in Excel and the Spreadsheet component
Limitations of XML Approach
- It doesn't support Excel macros.
There may me more limitations; but for generating fairly complex and well formed reports, XML approach is the best one. Since the code for generating XML files involves lot of repeatable code, we recommend to implement XML generation part as a class library so that it can be used in similar projects; saving your time and cost.
4. Security
Security is one of the key points to consider while architecting a reporting system. Managing security of the data exposed by the reports is a vital part. .NET Framework's robust support for Active Directory (AD) allows developers to easily implement security across the reporting system by applying AD permissions to report files themselves and granting permission to report processing objects only to those groups that should have access to specific data. Then the administration of all security - including the reporting system - can be managed by administering group membership.
While creating Excel reports in .NET, Office Automation is not the right approach. Still you prefer to use it, think serious about security loop holes exposed by your reporting architecture. Manage the permissions to anonymous account and ASPNET account very carefully.
One of the security approaches done in large security oriented database systems is to have a separate reporting server; in addition to the deployment server. It improves performance and security of the data. The data will be exposed to the out side world as views and these views will follow SQL/Windows authentication mechanisms.
5. Conclusion
Selecting the right tool for report generation depends on the type of application and cost overhead. The architecture to be decided, the component to be used etc are dependent on your deployment strategy and application architecture.
Think about the deployment strategy before deciding a reporting strategy. While generating Excel reports it is not wise to use Excel automation. Generating Excel reports as XML files sounds better in terms of scalability and performance. PDF and Crystal report generation are not difficult jobs in .NET; but be careful about the licensing issues in a distributed environment. Security in reports can be achieved through Active Directory Services and IIS.
References
http://www.businessobjects.com
http://www.datadynamics.com/default.aspx
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexcl2k2/html/odc_xlsmlinss.asp
http://www.microsoft.com/sql/reporting/default.asp
MSDN Articles
Glossary