Porting to Dotnet Core

Published Saturday 19 November 2016

.net Core is new and shiny. It discards a lot of the historical cruft that has built up in the full .net framework. It is fast and simple, but the downside is that many existing projects cannot practically be updated. Library projects with pure business logic are likely to be upgradable, but you'll soon hit problems with dependencies like data access or WCF. I've got an old data access helper library which would be very useful in .net Core. It provides functionality that .net Core misses: accessing database schema metadata (the names of tables and columns). The problem is that it uses ADO DataTables, and those have been purged from Core. So this is how I upgraded it...

A long time ago in a galaxy far far away...

Back in 2005 Microsoft released .net 2.0. One of the nice new features was ADO DbProviderFactories. You could write generic ADO code where the actual database and driver were decided purely by configuration. It also provided schema information about the database (tables, columns etc). A good idea, but it was all in ugly DataTables, and each database was different: Oracle had different columns to SqlServer.
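For context, the provider-independent pattern looked roughly like this (a sketch; the provider invariant name and the connectionString variable would normally come from configuration):

using System.Data;
using System.Data.Common;

//the provider name picks the driver; no compile-time reference to SqlClient needed
DbProviderFactory factory = DbProviderFactories.GetFactory("System.Data.SqlClient");
using (DbConnection connection = factory.CreateConnection())
{
    connection.ConnectionString = connectionString;
    connection.Open();
    //schema metadata comes back as loosely typed DataTables
    DataTable tables = connection.GetSchema("Tables");
    DataTable columns = connection.GetSchema("Columns");
}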

So I created a little library which read the DataTables into a standard model of plain .net objects (original 2006-vintage code here - there were just 5 classes). Over the years it evolved, adding more databases as I encountered them, including some pretty rare stuff like Sybase, Intersystems Cache and VistaDb. Eventually, in 2010, I put it on Codeplex as the DatabaseSchemaReader (now on Github). The nuget package has nearly 10,000 downloads now, so it's not just me who finds it useful.

Now in 2016, Microsoft has released .net Core 1.0. Core is a complete rewrite of the .net framework. DataTables are gone (good), but the DbProviderFactories are gone too (not so good). Sometime in the future there will be a replacement for GetSchema (see issue https://github.com/dotnet/corefx/issues/5024), but it's not in the RTM, and the proposal is very limited.

DatabaseSchemaReader to the rescue.

Step 1- Refactor

The first step was a thorough refactoring, not looking at .net Core at all. Fortunately, the large number of unit and integration tests was a huge help here.

Some of the code was 10 years old, so I could clear up a lot of legacy code. I had obsolete properties that I could get rid of. There were parallel VS2008 and VS2010 solutions/projects, so goodbye 2008 and welcome VS2015 by default. The unit tests were MsTest/NUnit switchable, which had seemed like a good idea at the time (what was I thinking?).

The major refactoring was restructuring the database reading. I had to remove (or at least move) the DataTables and DbProviderFactories under the covers. The GetSchema implementations in database clients tend to be slow and inefficient, so for the most common databases I could replace them entirely with my own, more efficient code.

After the refactoring, I still had DataTables and the provider factories for the less common databases. Now they had been moved into specific folders/namespaces. These would be included only for the .net 4.6 and older frameworks; .net Core would not use them. It is pretty unlikely we'll see an Intersystems Cache or Informix client for .net Core any time soon, and we can safely leave them on the full .net framework.

Meanwhile, I wrote simple ADO queries for SqlServer, Oracle, MySql, PostgreSql and SQLite. The integration tests proved they worked (although a few bugs still sneaked through). Only SqlServer, SQLite and PostgreSql had .net Core clients at the time, but I would be ready for MySql and Oracle whenever theirs arrived.
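The replacement queries are plain ADO against the database's own metadata views. A simplified sketch of the idea, for databases that expose the standard INFORMATION_SCHEMA views (SqlServer, MySql, PostgreSql), assuming an open DbConnection called connection:

//query the metadata views directly instead of calling GetSchema
const string sql = @"SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'";
using (var command = connection.CreateCommand())
{
    command.CommandText = sql;
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            var schemaName = reader.GetString(0);
            var tableName = reader.GetString(1);
            //...build the plain .net model objects from these values
        }
    }
}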

Creating a .net Core project

There is a guide to "porting from .net framework to .net core". The ".net portability analyzer" is a useful Visual Studio plugin for checking compatibility.

The easiest upgrade path is to create a project.json and a {project}.xproj in the same directory as the {project}.csproj (see my post from May). A simple library can have a very simple project.json, although getting the format right is still a challenge (too many RC1-RC2 examples on the web, little up-to-date documentation). You can even make it multi-target, from .net framework 3.5 up to .net Core's netstandard1.6.

Note that from Visual Studio 2017 (now in RC1 as of 16 November 2016), project.json will disappear and we'll go back to msbuild (that tooling has also just been released, in alpha quality). There's even automatic migration from json to xml. I'll look at this again at RTM, or when there's a stable beta.

Since VS2010, solution and csproj files have been mostly forward and backward compatible. VS2008 wasn't, which is why I kept the shadow 2008 projects around for so long. Unfortunately the new msbuild is a breaking change, so it's VS2017 only. It takes years for new Visual Studio versions to be adopted; there are plenty of companies still on 2012 or 2013 as a corporate policy. They can use the library (via nuget), but they can't take the source.

{
    "version": "1.0.0",
    "packOptions": {
        "summary": "..."
        //all the nuspec stuff
    },
    "buildOptions": {
        "xmlDoc": true
    },
    "frameworks": {
        "netstandard1.5": {
            "dependencies": {
                "NETStandard.Library": "1.6.0"
            }
        },
        "net35": { },
        "net40": { },
        "net46": { }
    }
}

Here we target netstandard1.5. netstandard1.6 is the fullest API at RTM (June 2016). Lower numbers (down to 1.0) reach more platforms (Windows Phone! Silverlight!), but have less API. I made the first version during RC2, which was netstandard1.5. There were a handful of issues when I tried 1.3, which supports the Universal Windows Platform. Since SQLite is the only database UWP can access, it didn't seem worth the effort, so it's still 1.5.

The dependency "NETStandard.Library" here is the nuget package containing the APIs for netstandard1.0 and above. It's a little confusing that netstandard1.0 - netstandard1.5 are in library version 1.6 (don't even look at the old dnx stuff).

Hiding the old stuff

By default, project.json compiles all .cs files. But you can exclude folders and code files from specific frameworks.

    "frameworks": {
        "netstandard1.5": {
            "buildOptions": {
                "define": [ "COREFX" ],
                "compile": {
                    "exclude": [
                        "FullFrameworkStuff/**/*.cs"
                    ]
                }
            },
            "dependencies": { ...

Previously I'd refactored the project so all the old DataTable stuff was in specific folders. It's still there, but .net Core doesn't see it any more, thanks to these exclusions.

Test projects

Test projects are a little different in Core: they are actually console apps, not libraries (so the framework is netcoreapp1.0, not netstandard). Your choice of test frameworks includes XUnit, NUnit and MsTest. I had a boring old MsTest project, but since RTM I had not been able to get Visual Studio to discover the tests (even though the project builds). The command line works ("dotnet test"). Converting to XUnit or NUnit is yak shaving. Resharper didn't recognise Core at all. With the immature tooling (we're still in "preview2"), I could not do test-driven development for .net Core in Visual Studio. I only started to see usable test projects in August, after several updates to the Visual Studio tooling (almost weekly at one stage...) and Resharper's 2016.2 update.
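For reference, a minimal MsTest project.json looked something like this (hedged: the package names and versions are approximate, as they were changing from preview to preview):

{
    "version": "1.0.0",
    "testRunner": "mstest",
    "dependencies": {
        "MSTest.TestFramework": "1.0.*",
        "dotnet-test-mstest": "1.1.*"
    },
    "frameworks": {
        "netcoreapp1.0": {
            "dependencies": {
                "Microsoft.NETCore.App": {
                    "type": "platform",
                    "version": "1.0.0"
                }
            }
        }
    }
}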

For development, I decided to stay in the .net full framework, where I had literally hundreds of tests. I kept the separate Core solution with a minimal MsTest project, which I ran only from the command line. For the continuous integration build script (on AppVeyor), I built the full framework solution and ran the tests, then added an "After build" step with the "dotnet pack" command, which builds Core from the project.json.

Compiler directive hell

The main API of DatabaseSchemaReader follows the underlying DbProviderFactories in taking a provider name, ultimately defined in configuration. The factories then instantiate the ADO connection object for the database.

var dr = new DatabaseReader(connectionString, "System.Data.SqlClient");
var schema = dr.ReadAll();

That's not possible in .net Core, and the philosophy is now dependency injection and configuration in code, not xml. To get the ADO connection, we should pass it in, not try to create it.

var dbConnection = new SqlConnection(connectionString); //could also be Sqlite etc
var dr = new DatabaseReader(dbConnection);
var schema = dr.ReadAll();

For .net 4.5 and below, we want to see the old-style constructors. For .net Core, we want the new constructors. We can use preprocessor directives.

#if COREFX
public DatabaseReader(DbConnection dbConnection) { ... }
#else
public DatabaseReader(string connectionString, string providerName) { ... }
#endif

The new dotnet compiler automatically creates directives with the name of the framework, upper-cased, underscores for dots (net35 is NET35, netstandard1.5 is NETSTANDARD1_5). You can also define directives in project.json; rather than refer to a specific NetStandard, I assigned "COREFX".

Sometimes a quick shim solves a problem. I had [Serializable] attributes all over the code to explicitly mark classes as binary serializable. .net Core doesn't have binary serialization (yet), or the SerializableAttribute. An empty class wrapper in compiler directives solves the problem.

#if NETSTANDARD1_5
using System;
namespace MyLibrary
{
    //an empty stand-in attribute, so [Serializable] still compiles on Core
    public class SerializableAttribute : Attribute { }
}
//an empty namespace, so "using System.Runtime.Serialization" still compiles
namespace System.Runtime.Serialization { }
#endif

One of the big annoyances is how accessing type properties via reflection has changed. You have to do things like this:

#if NETSTANDARD1_5
            var typeInfo = type.GetTypeInfo();
            var isValue = typeInfo.IsValueType;
#else
            var isValue = type.IsValueType;
#endif

Or this:

#if !COREFX
            var bindByName = command.GetType().GetProperty("BindByName");
#else
            var bindByName = command.GetType().GetTypeInfo().GetDeclaredProperty("BindByName");
#endif

It can take a bit of refactoring to get these directives manageable.
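One way to tame them (a sketch; TypeHelper is a hypothetical name, using the COREFX directive defined above) is to push the directives into a single helper class, so the rest of the code base stays directive-free:

using System;
#if COREFX
using System.Reflection;
#endif

namespace MyLibrary
{
    //keep the #if noise in one place; callers just use TypeHelper.IsValueType(type)
    internal static class TypeHelper
    {
        public static bool IsValueType(Type type)
        {
#if COREFX
            //netstandard: type properties moved to TypeInfo
            return type.GetTypeInfo().IsValueType;
#else
            //full framework: the property is on Type itself
            return type.IsValueType;
#endif
        }
    }
}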

The Stored Procedure Problem

Databases don't give you metadata on the output of a stored procedure: what are the column names and types? While GetSchema gives you metadata for the database tables and columns, what we need here is the shape of the result set of an arbitrary query. In the .net full framework, the DataTable gives you that information (see also DbDataReader.GetSchemaTable()).
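On the full framework, it looks something like this (a sketch; "dbo.MyProc" is a placeholder procedure name):

using System;
using System.Data;
using System.Data.SqlClient;

//...
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("dbo.MyProc", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    connection.Open();
    //SchemaOnly asks the provider for column metadata, not data
    using (var reader = command.ExecuteReader(CommandBehavior.SchemaOnly))
    {
        DataTable schemaTable = reader.GetSchemaTable();
        foreach (DataRow row in schemaTable.Rows)
        {
            Console.WriteLine("{0} ({1})", row["ColumnName"], row["DataType"]);
        }
    }
}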

In .net Core, DbDataReader.GetSchemaTable exists but throws a NotSupportedException (and the DataTable type is an empty stub). There's a proposal for a DbDataReader.GetColumnSchema() which gives a read-only collection of DbColumns with strongly typed metadata about column names and types. Nice, but the issue is closed and it was never released (and if it doesn't get ported into the full framework, it will be awkward). It looks like the decision has been made to bring back the DataTable for netstandard2.0.
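Usage of the proposed API would look roughly like this (hedged: a sketch based on the corefx issue, not on anything in the RTM):

using System.Collections.ObjectModel;
using System.Data.Common;

//strongly typed column metadata instead of a loosely typed DataTable
ReadOnlyCollection<DbColumn> columns = reader.GetColumnSchema();
foreach (DbColumn column in columns)
{
    Console.WriteLine("{0} ({1})", column.ColumnName, column.DataType);
}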

Without these bits, I can't determine what stored procedures return in .net Core.

The Future Is Fantastic

The .net Core roadmap from July 2016 seems to be accurate, with Visual Studio 2017 RC1 out in November 2016 and the msbuild alpha which replaces project.json with an improved csproj. Visual Studio 2017 RC has preview support, and should RTM with full support. VS2015 may be a dead-end for Core development after VS2017 RTMs early next year.

There was a lot to like about project.json, especially compared to the ugly and verbose msbuild xml. But json is tricky to edit, and documentation is sparse and confused by lots of versions. An improved, more concise msbuild syntax isn't necessarily a step back.

A lot of the missing APIs (such as the type stuff and binary serialization) will get into Core as "netstandard2.0" around "Q1/Q2 2017". These will make porting to .net core a more practical possibility.
