C# & .NET questions that interviewers get wrong
Author: Douglas Williams. Tags: C#, .NET
Many experienced C# & .NET developers have, at some point in their lives, attended an interview where they were asked a question whose answer is trickier than the interviewer assumes. The below is a compilation of such questions, their typical expected answers, and the actual correct answers.
foreach
Question: Which interface does a type need to implement to be used in a foreach statement?
Expected answer: IEnumerable or IEnumerable<T>. (Bonus: From C# 8.0, IAsyncEnumerable<T> can be used with await foreach.)
Correct answer: A type does not need to implement any interface. It needs to have a public parameterless GetEnumerator method, whose return type has a public Current property getter and a public parameterless MoveNext method whose return type is bool. These requirements are met by the popular IEnumerable and IEnumerable<T> interfaces. (Source: foreach (C# reference) on Microsoft Learn.)
Explanation:
There are several C# keywords that are ubiquitously associated with specific interfaces: foreach with IEnumerable or IEnumerable<T>; using with IDisposable; and await with Task or Task<TResult>. (C# 7.0 and later versions have extended the above significantly.) This leads many developers to assume that a type must implement one of the said interfaces to be valid for the given keyword. However, this is not always the case. It turns out that the C# compiler performs pattern matching – a form of duck typing – for some of these keywords, including foreach.
Here is a simple class that enumerates the natural numbers from 1 to 10. It implements the foreach pattern, but no interfaces.
public class NumericSequence
{
    public NumericEnumerator GetEnumerator()
    {
        return new NumericEnumerator();
    }

    public class NumericEnumerator
    {
        public int Current { get; private set; }

        public bool MoveNext()
        {
            Current++;
            return Current <= 10;
        }
    }
}
Instances of this class may be used in a foreach statement:
foreach (var x in new NumericSequence())
    Console.Write(x + " ");

// Output: 1 2 3 4 5 6 7 8 9 10
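For such pattern-based types, the compiler lowers the foreach statement into a direct call sequence. The following self-contained sketch shows roughly what gets generated (simplified; the real lowering also checks whether the enumerator needs disposing):

```csharp
using System;

public class NumericSequence
{
    public NumericEnumerator GetEnumerator() => new NumericEnumerator();

    public class NumericEnumerator
    {
        public int Current { get; private set; }
        public bool MoveNext() => ++Current <= 10;
    }
}

public static class Program
{
    public static void Main()
    {
        // Roughly the expansion of:
        //     foreach (var x in new NumericSequence()) Console.Write(x + " ");
        var enumerator = new NumericSequence().GetEnumerator();
        while (enumerator.MoveNext())
        {
            var x = enumerator.Current;
            Console.Write(x + " ");
        }
        // Output: 1 2 3 4 5 6 7 8 9 10
    }
}
```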
Eric Lippert has a good article explaining this: Following the pattern.
The main reason for this design decision was performance. C# 1.0 did not have generics, so only IEnumerable was available, not IEnumerable<T>. The IEnumerable interface has a GetEnumerator method whose return type is IEnumerator, and the IEnumerator interface has a Current property whose type is object. Thus, when iterating over a collection of value types (such as int), each item accessed through this property would get boxed, incurring a performance penalty. By using the pattern-matching approach, enumerator types can expose correctly typed Current properties that avoid this boxing.
From the other examples cited above, it is worth noting that await follows a similar pattern-based requirement, as explained in Awaitable expressions. However, using does require a type to implement the IDisposable interface – the Dispose method by itself does not suffice.
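One nuance worth noting: from C# 8.0, ref struct types are an exception. Because a ref struct could not implement interfaces at the time, the compiler accepts a pattern-based public Dispose method on them. A minimal sketch:

```csharp
using System;

// From C# 8.0, a ref struct can participate in `using` through a
// pattern-based Dispose method, even though it implements no interface.
ref struct Resource
{
    public void Dispose() => Console.WriteLine("Disposed");
}

class Program
{
    static void Main()
    {
        using (var r = new Resource())
        {
            Console.WriteLine("In scope");
        }
        // Output: In scope
        //         Disposed
    }
}
```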
volatile
Question: Where would you use the volatile keyword?
Expected answer: The volatile keyword is used on fields that may be modified by multiple threads concurrently, and ensures that the most up-to-date value is present in the field at all times.
Correct answer: In the vast majority of cases, you shouldn't use volatile at all. Instead, you should use higher-level thread-synchronization constructs, such as lock, SemaphoreSlim, and EventWaitHandle; or concurrent collections such as ConcurrentDictionary<TKey, TValue>. If you genuinely need to write low-level nonblocking code, you would use volatile for fields that should generate an acquire-fence on every read and a release-fence on every write. Such fields may be used for certain nonblocking concurrent patterns, such as safe publication, which is typically used for lazy initialization.
Explanation:
The expected answer is adapted from the Microsoft documentation for volatile (C# Reference) (before it was corrected around 2018):
The volatile keyword indicates that a field might be modified by multiple threads that are executing at the same time. Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread. This ensures that the most up-to-date value is present in the field at all times. The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access.
The above documentation – particularly, the statement that “the most up-to-date value is present in the field at all times” – is fundamentally flawed, as demonstrated by Joseph Albahari in Threading in C#, one of the best resources on the topic. The fact that the Microsoft documentation itself got the definition wrong is clear evidence of the confusion surrounding this keyword. Albahari provides a correct definition:
The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field. An acquire-fence prevents other reads/writes from being moved before the fence; a release-fence prevents other reads/writes from being moved after the fence. These “half-fences” are faster than full fences because they give the runtime and hardware more scope for optimization.
The C# Memory Model in Theory and Practice, by Igor Ostrovsky, discusses the patterns where volatile may be correctly used. The main pattern for which volatile was intended is called “safe publication”. Consider the example it provides:
public class DataInit
{
    private int _data = 0;
    private /* volatile */ bool _initialized = false;

    public void Init()
    {
        _data = 42;            // Write 1
        _initialized = true;   // Write 2
    }

    public void Print()
    {
        if (_initialized)                // Read 1
            Console.WriteLine(_data);    // Read 2
        else
            Console.WriteLine("Not initialized");
    }
}
If Init and Print are executed concurrently on different threads for the same instance of the DataInit class, one would normally expect either 42 or Not initialized to be printed. However, the .NET memory model allows memory operations to be reordered when observed from another thread. In the example above, the thread executing Print may observe _initialized being set to true before it observes _data being set to 42, and thus end up printing 0.
The safe publication pattern entails declaring the _initialized field as volatile. This prevents its write from being reordered with the prior write to _data, and its read from being reordered with the subsequent read of _data. In practice, _data would typically be a complex type with multiple fields that need to be initialized too; the pattern still guarantees that _data is observed either as fully initialized or as not initialized at all. Igor calls this pattern “Publication via Volatile Field” and describes it as “so important that the semantics of the volatile keyword were designed around it”. He recommends using it to remember the semantics of the volatile keyword instead of the abstract rules defined earlier. He goes on to demonstrate that lazy initialization – another popular pattern where volatile commonly gets used – is just a variant of this pattern.
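To make the lazy-initialization variant concrete, here is a sketch of the classic double-checked locking pattern (the class and field names are illustrative, not from the cited article):

```csharp
using System;

public class Widget
{
    public int Value { get; } = 42;
}

public static class LazyWidget
{
    private static readonly object _gate = new object();

    // volatile makes the write that publishes the instance a release-fence,
    // and every read of the field an acquire-fence, so no thread can
    // observe a partially constructed Widget.
    private static volatile Widget _instance;

    public static Widget Instance
    {
        get
        {
            if (_instance == null)          // fast path: no lock once initialized
            {
                lock (_gate)
                {
                    if (_instance == null)  // re-check inside the lock
                        _instance = new Widget();
                }
            }
            return _instance;
        }
    }
}
```

In modern code, Lazy&lt;T&gt; encapsulates this pattern and is usually preferable to writing it by hand.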
Instantiating interfaces
Question: Can you use the new operator on an interface type?
Expected answer: No. You can only use new to instantiate non-abstract classes or value types.
Correct answer: Yes. You can use new on interfaces decorated with all three of the following attributes: CoClassAttribute, GuidAttribute, and ComImportAttribute. The CoClassAttribute identifies the concrete type that implements the interface and that gets instantiated behind the scenes. These attributes are associated with COM Interop, and should not be used elsewhere.
Explanation:
This feature is mainly used as plumbing for COM Interop types, including the Microsoft Office primary interop assemblies. For example, the Outlook Application interface is backed by the concrete class named ApplicationClass:
[ComImport]
[GuidAttribute("00063001-0000-0000-C000-000000000046")]
[CoClassAttribute(typeof(ApplicationClass))]
public interface Application : _Application, ApplicationEvents_11_Event
Consumers are recommended to use the Application interface rather than the ApplicationClass class; see Get and sign in to an instance of Outlook. However, developers who have used Office interop are likely to have come across this oddity of using new on interface types and wondered how it works.
For the sake of demonstration, the following example shows how these attributes may be applied:
[ComImport]
[Guid("175EB158-B655-11E1-B477-02566188709B")] // random GUID
[CoClass(typeof(Foo))]
interface IFoo
{
    string Bar();
}

class Foo : IFoo
{
    public string Bar()
    {
        return "Hello world";
    }
}
Using the above definitions, one can use new on the IFoo interface:
var a = new IFoo();
Console.WriteLine(a.Bar());
// Output: "Hello world"
Marc Gravell warns that this feature should be treated merely as a curiosity, as it relies on a Microsoft-specific extension to the C# specification, and should not be expected to work on other implementations of the compiler.
Recursive Fibonacci
Question: Write a method that computes the nth term of the Fibonacci sequence recursively.
Expected answer:
public static long Fibonacci(int n)
{
    if (n == 0) return 0;
    if (n == 1) return 1;
    return Fibonacci(n - 1) + Fibonacci(n - 2);
}
Better answer:
public static long Fibonacci(int n)
{
    static long FibonacciInner(int n, long prev, long curr)
    {
        if (n == 0) return prev;
        return FibonacciInner(n - 1, curr, curr + prev);
    }

    return FibonacciInner(n, 0, 1);
}
Explanation:
The simple answer provided above is a naive implementation that produces the correct result. However, its performance would be abysmal for large inputs, and it demonstrates a lack of understanding or consideration of code efficiency.
Each recursive call that does not hit the base cases (0 and 1) results in a further two calls. For example, Fibonacci(9) would need to compute Fibonacci(8) and Fibonacci(7), which in turn would respectively need to compute Fibonacci(7) and Fibonacci(6), and Fibonacci(6) and Fibonacci(5), which in turn would collectively need to make eight calls, then 16 calls, then 32 calls, and so on, until the base cases are hit.
The time complexity is Θ(φⁿ), where φ is the golden ratio, whose value is approximately 1.618.
Fibonacci(10) would require 177 recursive calls; Fibonacci(20) would require 21,891; Fibonacci(30) would require 2,692,537; and Fibonacci(40) would require 331,160,281.
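These counts are easy to verify empirically with an instrumented copy of the naive method (a quick sketch, not part of the expected answer):

```csharp
using System;

public static class FibCounter
{
    public static long Calls;

    public static long Fibonacci(int n)
    {
        Calls++;  // count every invocation, including the initial call
        if (n == 0) return 0;
        if (n == 1) return 1;
        return Fibonacci(n - 1) + Fibonacci(n - 2);
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (int n in new[] { 10, 20, 30 })
        {
            FibCounter.Calls = 0;
            FibCounter.Fibonacci(n);
            Console.WriteLine($"Fibonacci({n}): {FibCounter.Calls} calls");
        }
        // Output: Fibonacci(10): 177 calls
        //         Fibonacci(20): 21891 calls
        //         Fibonacci(30): 2692537 calls
    }
}
```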
One can easily see that there is a lot of repetition across the multiple recursion (Fibonacci(n - 1) and Fibonacci(n - 2)). To optimize the algorithm, one should convert it into single recursion – specifically, tail recursion, where the recursive call is the last operation performed. To accomplish this, it is easier to start from the iterative implementation:
public static long Fibonacci(int n)
{
    if (n == 0) return 0;
    if (n == 1) return 1;

    var (prev, curr) = (0L, 1L);
    for (int i = 2; i <= n; i++)
        (prev, curr) = (curr, curr + prev);
    return curr;
}
In each iteration, the variable for the previous value, prev, is assigned the current value from the previous iteration; whilst the variable for the current value, curr, is assigned the sum of the current and previous values from the previous iteration. The assignment is performed using value tuples so as to avoid the need for a temporary variable. This solution may be converted into a recursive formulation by replacing the local variables with parameters. In either case, the time complexity becomes Θ(n); the space complexity is Θ(1), although for the recursive version this relies on the tail-call optimization discussed below. Fibonacci(n) would only ever require n recursive calls.
An added advantage of tail recursion is that it can be subject to a compiler optimization known as tail-call elimination. In general, each recursive call requires a new stack frame to be added to the call stack. However, when the recursive call is the last operation executed by the current function call, the stack frame of the current function call is no longer needed, and may be repurposed for the tail call, avoiding the need to add another stack frame. This optimization would not be possible for the naive answer due to its multiple recursion. (Note that neither the C# compiler nor the .NET JIT guarantees tail-call elimination, so very large inputs could still overflow the stack in the recursive version.)
Recursive functions are a staple of functional programming languages. They are less popular in C-based languages, such as C#, where iterative implementations of algorithms tend to outperform their recursive counterparts. One advantage that some functional languages have over C# is that they support automatic memoization: the runtime stores the results of function calls for given inputs, and returns the cached results when the same inputs occur again. (This is only possible if the functions are referentially transparent, meaning they have no side-effects.) C# does not perform automatic memoization. However, it is possible to implement manual memoization using a collection:
public static long Fibonacci(int n)
{
    var cached = new long[n + 1];

    // The local function's parameter is named i to avoid clashing with the
    // enclosing parameter n (a non-static local function cannot shadow it).
    long FibonacciInner(int i)
    {
        if (i == 0) return 0;
        if (i == 1) return 1;

        if (cached[i] == 0)
            cached[i] = FibonacciInner(i - 1) + FibonacciInner(i - 2);
        return cached[i];
    }

    return FibonacciInner(n);
}
This is also an acceptable answer, as its time complexity is still Θ(n), but the space complexity is increased to Θ(n).
Note that the long type is only suitable for holding results up to Fibonacci(92), which is 7,540,113,804,746,346,429. For larger terms, the BigInteger type should be used instead.
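A minimal iterative BigInteger variant might look like this (same Θ(n) approach; the class name is illustrative):

```csharp
using System.Numerics;

public static class BigFib
{
    // Iterative Fibonacci over BigInteger, usable beyond Fibonacci(92).
    public static BigInteger Fibonacci(int n)
    {
        var (prev, curr) = (BigInteger.Zero, BigInteger.One);
        for (int i = 0; i < n; i++)
            (prev, curr) = (curr, curr + prev);
        return prev;
    }
}
```

For example, BigFib.Fibonacci(93) returns 12,200,160,415,121,876,738, which already exceeds long.MaxValue.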
It is also worth noting that there are even faster algorithms for computing terms of the Fibonacci sequence, with time complexities as low as Θ(log n); see Fast Fibonacci algorithms. However, these would not be expected in a C# interview (other than in a mathematical domain).