C# & .NET questions that interviewers get wrong

Many experienced C# & .NET developers have, at some point in their lives, attended an interview where they were asked a question whose answer is trickier than the interviewer assumes. The below is a compilation of such questions, their typical expected answers, and the actual correct answers.

foreach

Question: Which interface does a type need to implement to be used in a foreach statement?

Expected answer: IEnumerable or IEnumerable<T>. (Bonus: From C# 8.0, IAsyncEnumerable<T> can be used with await foreach.)

Correct answer: A type does not need to implement any interface. It needs to have a public parameterless GetEnumerator method, whose return type has a public Current property getter and a public parameterless MoveNext method whose return type is bool. These requirements are met by the popular IEnumerable and IEnumerable<T> interfaces. (Source: foreach, in (C# reference) on Microsoft Docs.)

Explanation:

There are several C# keywords that are ubiquitously associated with specific interfaces: foreach with IEnumerable or IEnumerable<T>; using with IDisposable; and await with Task or Task<TResult>. (C# 7.0 and later versions have extended the above significantly.) This leads many developers to assume that a type must implement one of the said interfaces to be valid for the given keyword. However, this is not always the case. It turns out that the C# compiler performs pattern-matching – a form of duck typing – for some of these keywords, including foreach.

Here is a simple class that enumerates through the natural numbers from 1 to 10. It implements the foreach pattern, but no interfaces.

public class NumericSequence
{
    public NumericEnumerator GetEnumerator()
    {
        return new NumericEnumerator();
    }

    public class NumericEnumerator
    {
        public int Current { get; private set; }

        public bool MoveNext()
        {
            Current++;
            return Current <= 10;
        }
    }
}

Instances of this class may be used in a foreach statement:

foreach (var x in new NumericSequence())
    Console.Write(x + " ");

// Output: 1 2 3 4 5 6 7 8 9 10 

Eric Lippert has a good article explaining this: Following the pattern. The main reason for this design decision had been performance-related. C# 1.0 did not have generics, so only IEnumerable was available, not IEnumerable<T>. The IEnumerable interface has a GetEnumerator method whose return type is IEnumerator. The IEnumerator interface has a Current property whose type is object. Thus, when iterating over a collection of value types (such as int), each item accessed through this property would get boxed, incurring a performance penalty. By using the pattern-matching approach, enumerator types could implement correctly typed Current properties that avoid this boxing.

From the other examples cited above, it is worth noting that await follows a similar pattern-based requirement, as explained in Awaitable expressions. However, using does require a type to implement the IDisposable interface – the Dispose method by itself does not suffice.

volatile

Question: Where would you use the volatile keyword?

Expected answer: The volatile keyword is used on fields that may be modified by multiple threads concurrently, and ensures that the most up-to-date value is present in the field at all times.

Correct answer: In the vast majority of cases, you shouldn't use volatile at all. Instead, you should use higher-level thread synchronization constructs, such as lock, SemaphoreSlim, and EventWaitHandle; or concurrent collections such as ConcurrentDictionary<TKey, TValue>. If you genuinely need to write low-level nonblocking code, you would use volatile for fields that should generate an acquire-fence on every read from that field, and a release-fence on every write to that field. Such fields may be used for certain nonblocking concurrent patterns, such as safe publication, which is typically used for lazy initialization.

Explanation:

The expected answer is adapted from the Microsoft documentation for volatile (C# Reference) (before it was corrected around 2018):

The volatile keyword indicates that a field might be modified by multiple threads that are executing at the same time. Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread. This ensures that the most up-to-date value is present in the field at all times.

The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access.

The above documentation – particularly, the statement that “the most up-to-date value is present in the field at all times” – is fundamentally flawed, as demonstrated by Joseph Albahari in Threading in C#, one of the best resources on the topic. The fact that the Microsoft documentation itself got the definition wrong is clear evidence of the confusion surrounding this keyword. Albahari provides a correct definition:

The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field. An acquire-fence prevents other reads/writes from being moved before the fence; a release-fence prevents other reads/writes from being moved after the fence. These “half-fences” are faster than full fences because they give the runtime and hardware more scope for optimization.

The C# Memory Model in Theory and Practice, by Igor Ostrovsky, discusses the patterns where volatile may be correctly used. The main pattern for which volatile was intended is called “safe publication”. Consider the example it provides:

public class DataInit 
{
    private int _data = 0;
    private /* volatile */ bool _initialized = false;

    public void Init() 
    {
        _data = 42;            // Write 1
        _initialized = true;   // Write 2
    }

    public void Print() 
    {
        if (_initialized)              // Read 1
            Console.WriteLine(_data);  // Read 2
        else
            Console.WriteLine("Not initialized");
    }
}

If Init and Print are executed concurrently on different threads for the same instance of the DataInit class, then one would normally expect either 42 or Not initialized to be printed. However, the .NET memory model allows memory operations to be reordered when observed from another thread. In the example above, the thread executing Print may observe _initialized being set to true before it observes _data being set to 42, and thus end up printing 0.

The safe publication pattern entails declaring the _initialized field as volatile. This would prevent its write from being reordered with the prior write to _data, and its read from being reordered with the subsequent read of _data. In practice, _data would typically be a complex type with multiple fields that need to get initialized too. This pattern still guarantees that _data is observed either as fully initialized or as not initialized at all.

Igor calls this pattern “Publication via Volatile Field”, describes it as “so important that the semantics of the volatile keyword were designed around it”. He recommends using it to remember the semantics of the volatile keyword instead of the abstract rules defined earlier. He goes on to demonstrate that lazy initialization – another popular pattern where volatile commonly gets used – is just a variant of this pattern.

Instantiating interfaces

Question: Can you use the new operator on an interface type?

Expected answer: No. You can only use new to instantiate non-abstract classes or value types.

Correct answer: Yes. You can use new on interfaces decorated with all three of the following attributes: CoClassAttribute, GuidAttribute, ComImportAttribute. The CoClassAttribute identifies the concrete type that implements the interface and that gets instantiated behind the scenes. These attributes are associated with COM Interop, and should not be used elsewhere.

Explanation:

This feature is mainly used as plumbing for COM Interop types, including the Microsoft Office primary interop assemblies. For example, the Outlook Application interface is backed by the concrete class named ApplicationClass:

[ComImport]
[GuidAttribute("00063001-0000-0000-C000-000000000046")]
[CoClassAttribute(typeof(ApplicationClass))]
public interface Application : _Application, ApplicationEvents_11_Event

Consumers are recommended to use the Application interface rather than the ApplicationClass class; see Get and sign in to an instance of Outlook.

These attributes are associated with COM Interop, and should not be used elsewhere. However, developers who have used Office interop are likely to have come across this oddity of using new on interface types and questioned how this works. For the sake of demonstration, the following example shows how these attributes may be applied:

[ComImport]
[Guid("175EB158-B655-11E1-B477-02566188709B")]   // random GUID
[CoClass(typeof(Foo))]
interface IFoo
{
    string Bar();
}

class Foo : IFoo
{
    public string Bar()
    {
        return "Hello world"; 
    }
}

Using the above definitions, one can use new on the IFoo interface:

var a = new IFoo();
Console.WriteLine(a.Bar());

// Output: "Hello world"

Marc Gravell warns that this feature should be treated merely as a curiosity, as it relies on a Microsoft-specific extension to the C# specification, and should not be expected to work on other implementations of the compiler.

Recursive Fibonacci

Question: Write a method that computes the nth term of the Fibonacci sequence recursively.

Expected answer:

public static long Fibonacci(int n)
{
    if (n == 0) return 0;
    if (n == 1) return 1;

    return Fibonacci(n - 1) + Fibonacci(n - 2);
}

Better answer:

public static long Fibonacci(int n)
{
    static long FibonacciInner(int n, long prev, long curr)
    {
        if (n == 0) return prev;
        return FibonacciInner(n - 1, curr, curr + prev);
    }

    return FibonacciInner(n, 0, 1);
}

Explanation:

The simple answer provided above is a naive implementation that produces the correct result. However, its performance would be abysmal for large numbers, and demonstrates a lack of understanding or consideration of code efficiency. Each recursive call that does not hit the base cases (0 and 1) results in a further two calls. For example, Fibonacci(9) would need to compute Fibonacci(8) and Fibonacci(7), which in turn would respectively need to compute Fibonacci(7) and Fibonacci(6), and Fibonacci(6) and Fibonacci(5), which in turn would collectively need to make eight calls, then 16 calls, then 32 calls, and so on, until the base cases are hit. The time complexity is Θ(φn), where φ is the golden ratio, whose value is approximately 1.618. Fibonacci(10) would require 177 recursive calls; Fibonacci(20) would require 21,891; Fibonacci(30) would require 2,692,537; and Fibonacci(40) would require 331,160,281.

One can easily see that there is a lot of repetition across the multiple recursion (Fibonacci(n - 1) and Fibonacci(n - 2)). To optimize the algorithm, one should convert it into single recursion (also called tail recursion). To accomplish this, it is easier to start from the iterative implementation:

public static long Fibonacci(int n)
{
    if (n == 0) return 0;
    if (n == 1) return 1;

    var (prev, curr) = (0L, 1L);
    for (int i = 2; i <= n; i++)
        (prev, curr) = (curr, curr + prev);

    return curr;
}

In each iteration, the variable for the previous value, prev, is assigned the current value from the previous iteration; whilst the variable for the current value, curr, is assigned the sum of the current and previous values from the previous iteration. The assignment is performed using value tuples so as to avoid the need for a temporary variable. This solution may be converted into a recursive formulation by replacing the local variables by parameters. In either case, the time complexity now becomes Θ(n), and the space complexity is merely Θ(1). Fibonacci(n) would only ever require n recursive calls.

An added advantage of tail recursion is that it can be subject to a compiler optimization known as tail call recursion. In general, each recursive call requires a new stack frame to be added to the call stack. However, when the recursive call is the last operation executed by the current function call, then the stack frame of the current function call is no longer needed, and may be repurposed for the tail call, avoiding the need for adding another stack frame for the latter. This optimization would not be possible for the naive answer due to its multiple recursion.

Recursive functions are a staple of functional programming languages. They are less popular in C-based languages, such as C#, where iterative implementations of algorithms tend to outperform their recursive counterparts. One advantage that some functional languages have over C# is that they support automatic memoization. This means that the runtime would store the results of function calls for given inputs, and return the cache results when the same inputs occur again. (This is only possible if the functions are referentially transparent, meaning they have no side-effects.) C# does not perform automatic memoization. However, it is possible to implement manual memoization using a collection:

public static long Fibonacci(int n)
{
    var cached = new long[n + 1];

    long FibonacciInner(int n)
    {
        if (n == 0) return 0;
        if (n == 1) return 1;

        if (cached[n] == 0)
            cached[n] = FibonacciInner(n - 1) + FibonacciInner(n - 2);
        return cached[n];
    }

    return FibonacciInner(n);
}

This is also an acceptable answer, as its time complexity is still Θ(n), but the space complexity is increased to Θ(n).

Note that the long type is only suitable for holding results up to Fibonacci(92), which is 7,540,113,804,746,346,429. For larger terms, the BigInteger type should be used instead. It is also worth noting that there are even faster algorithms for computing terms of the Fibonacci sequence, with time complexities as low as Θ(log n); see Fast Fibonacci algorithms. However, these would not be expected to be produced in a C# interview (other than in a mathematical domain).