This is a blog about EU law, with a focus on free movement, public procurement and competition law issues. I use it to publish my thoughts on recent developments and to comment on selected Judgments of the Court of Justice of the EU.
Last week, I received the news that from 1 August 2019, I will be promoted to Professor of Economic Law at the University of Bristol Law School. The news has now sunk in and I am slowly getting used to the idea. As my grandma used to say, it is easy to get used to the nice things; getting used to the tough things takes some more character…
This promotion is the result of a very long administrative/HR process, which has given me some time to think about what a promotion to (full) professorship would mean and what I would like to make of my new position. It has also made me reflect on my academic career so far and on how I got here. These are some of my raw thoughts. Not sure if they will be of any interest, but I needed to spit them out.
How I got here
The short answer is that I got here by chance, with a lot of luck and even more help from mentors, colleagues, friends and students along the way. And, of course, thanks to the motivation that those interested in my scholarship have provided, particularly through this blog. I am indebted to all of them (you) and it would be impossible for me to come up with a list that covered even 10% of those generous individuals that lent me a helping hand, stimulated or challenged me intellectually, gave me platforms and opportunities to grow and, perhaps most important, made it all very enjoyable along the way.
There are a couple of other aspects of how I got here on which I have been thinking with particular intensity. The first one is that I got here as a privileged, middle-class, white, male academic with no family responsibilities. The second one is that I got here thanks to the brilliant mentoring offered by experienced female colleagues. Let me unpack this.
I enjoyed all privileges of someone that could study without having to work at the same time until I got to my PhD studies. I had a supportive family that really focused on my education from the get go and they not only put me through good schools, but also helped financially with the parts of my undergraduate studies not covered by my scholarship. I then had the privilege of working for big firms and my family’s support allowed me to save enough to self-fund my PhD. I did have to work during parts of my PhD, but mainly carrying out consultancy work to which I had access through contacts. So, all in all, I got to be a doctor (not a real one, a PhD in Law, I mean) and access academic posts thanks to the privileges I enjoyed from the day I was born, out of sheer luck. Sure, I did put in my 10,000 hours of dedication, but I did not have to overcome any significant obstacles.
I entered academia in September 2009 in Madrid and, despite the fact that the crisis had hit hard and there were not that many jobs, I was lucky enough to land a Lectureship at a private university (thanks, in no small part, to the contacts I had made during my time at the law firm and during my doctoral studies). I did not like the working conditions, though, so I took the gamble of moving to the UK. I could do so, in large part, because I had been able to study languages since age 7 and to complete a European Doctorate, which saw me move to Copenhagen, Washington DC and Oxford during my doctoral studies (on my personal and family funds, which is certainly not the situation of most PhD students, either then or now).
I took my first UK lectureship in May 2012 and, just over 7 years later, I have been promoted three times and moved across three higher education institutions. This has required hard work and dedication, of course, but I have also been immensely helped by the continued enjoyment of my privileges as a middle-class, white, male academic with no family responsibilities. The last issue is the one giving me the most to ponder.
Even if I am now an ‘academic dad’, my promotion application included information only until the early fall of 2018, so only just after my child was born. Thus, all work reflected in that application was done by someone without caring responsibilities. I was also extremely lucky to have a partner who understood my obsession with academia and my research and was willing to give me as much time and space as I needed to work very long hours, to travel (way too much, which I really regret and which I am seriously committed to changing) and to get involved in all those extra citizenship and staff-student activities that are nigh impossible to coordinate with childcare or other types of care responsibilities.
I thought I needed to say all this because, in the right context, the fact that I got promoted at just under 40 years of age and at just under 10 years from having taken up my first lectureship, certainly does not look as impressive as comparable promotions of colleagues with very different backgrounds and personal circumstances. All of them, and all those facing difficult circumstances and discrimination in higher education (and elsewhere) have my deepest admiration and respect.
The second aspect of how I got here, which is somewhat ironically related to the previous one, is that I have had the most amazing formal and informal mentors since I got to the UK and they were all experienced female colleagues. At every step of the way, but particularly since I joined the University of Bristol Law School, I have been enormously lucky to have the support and encouragement of truly great academics and generous colleagues that have helped me prioritise my work, present it in the best possible way, and constantly made me feel like I was worthy of whichever promotion or recognition I was seeking. I have also had some great male colleagues, but their commitment to nurture others, to help them grow and to enjoy doing so pales by comparison.
Interestingly enough, my mentors took me at face value and cared not about my being a privileged, middle-class, white, male academic with no family responsibilities. They were solely interested in my potential and by believing in it and making me exploit it, they have had a transformative impact on me as a scholar and as a person.
So, what type of professor do I want to be?
Needless to say, I want to continue carrying out research that I believe can have a positive social impact, and I want to remain committed to my open-access efforts to try to make my scholarship freely available to anyone interested in it. I also want to help my students learn and grow, and venture into the world with a critical perspective and a strong set of values. I want to be a good colleague and peer, and to treat those with whom I work with respect and with a sincere appreciation of their contribution to higher education.
Overall, however, I want to be a professor that enables other academics (including postgraduate students) to be the best version of themselves they can be, and a professor that does everything in their power to make higher education a better place. Again, let me unpack this, perhaps more concisely.
I want to emulate my female mentors. I want to be able to support willing and committed colleagues to blossom, regardless of their background and characteristics. I want to be committed to equality and fairness and to be able to set aside any prejudices and biases (conscious or unconscious). I want to put my seniority and whichever power or influence comes with it to the service of others, including where necessary to curb the unjustified privileges enjoyed by some at the expense of others. I do not want to shy away from difficult or conflict situations where I see an injustice being done.
I also want to make higher education a more enjoyable, more sustainable and more diverse environment for all of us working and studying in it. I want to contribute to an environment of non-instrumental intellectual curiosity and exchange. I want to make the best use of the ever-growing networking and connectivity opportunities we are offered to expand the reach of higher education and make it more accessible than ever. This is the part I still need to figure out, so I will welcome any suggestions on what needs to be done—either urgently, or in the longer run. For now, I will concentrate on sustainability issues and seek to influence others into adopting no/less-fly policies. It may not amount to much, but it is a start (for me).
Interest in the use of blockchain in the context of public procurement keeps rising by the day. It is hard to find a country where this is not a topic of discussion, although there seems to be a wide spectrum from enthusiastic and proactive approaches (eg in the UK, with the promotion of procurement-centred blockchain use cases by the All-Party Parliamentary Group on Blockchain) to more skeptical and wait-and-see approaches (in Scandinavian countries, eg Denmark or Sweden).
At the same time as some theoretical work starts to emerge—see eg Sope Williams-Elegbe’s exploratory inaugural lecture and Raquel Carvalho’s (not always very clear or accurate) recent paper—the need to get some practical insights in order to support theoretical speculation becomes all-important. However, accessing this information can be a little tricky, in particular if local or regional projects are only publicised in languages other than English.
So we organised a couple of webinars on the topic and asked participants to pool together any use cases they know of (and thanks to all of them for their contributions). In rough terms (and with apologies for any over-simplification), it looks like there are three main areas of experimentation:
Development of proof-of-concept / pilot projects seeking to tackle some parts of the procurement process, such as (a) initiatives on exclusion/selection of tenderers in Costa Rica and the Basque Country (Spain) and (b) initiatives on tender submission and evaluation by smart contracts in Aragon (Spain)
Development of proof-of-concept / pilot projects seeking to carry out the entire procurement process on the blockchain, such as in Mexico (federal level) and Cape Town (South Africa)
Development of ‘blockchain-like’ database approaches that seek to replicate some of the main features of a blockchain (in terms of data de-centralisation and tamper-evidence features), such as some projects run by the EBRD
We also learnt about other Govtech / Regtech applications of blockchain, such as the Finnish initiatives to provide bank cards to refugees and to centralise the exchange of information on mandatory motor vehicle insurance. There are also other well-known projects around property registers (eg for land and IP).
On the whole, though, it seems like the most promising potential applications of blockchain are those linked to information management/storage and the transfer of digital assets, and that there is more potential in those cases where there is no existing (working) database for their management. The difficulties of implementing blockchain-based solutions for not-super-simple procurement and off-chain aspects of procurement seem too high to overcome any time soon.
It also seems that there is a certain tension between the promise of transparency associated with blockchain infrastructure and the other attributes of the technology (mainly, tamper-evidence qualities), at least where the design of the blockchain is heavily permissioned and centralised. Perhaps as a very European issue (but also more broadly), compliance with data protection rules also comes up as a legal hurdle in every other project.
If you know of any other blockchain use cases in procurement, or if you have any other views on the potential of this technology for procurement governance, please comment on this post or get in touch: email@example.com
This is a personal reflection on the unethical behaviour of those academics that abuse their time at the podium when speaking at conferences—or those that deliver uninvited pseudo-speeches while pretending to formulate questions or comments. This is based on real facts. I do not apologise for any extreme views expressed here.
If you are an academic or have ever attended an academic conference, you probably know what I am talking about. Unless the conference is well-organised and time is zealously kept by a dedicated moderator/chair (and those who do should be highly praised for what really is a thankless and rather stressful job), there is always (*always*) someone that abuses their time at the podium. I am not talking about one- or two-minute overruns, but rather about speakers that happily exceed their time by 20 minutes or so, usually oblivious to or wilfully ignoring any polite notes (less than) discreetly passed to them by the organisers.
In my personal experience, this tends to be perpetrated by a senior colleague that has barely prepared and/or is delivering the exact same presentation we have already heard, perhaps with a minor tweak. This would, even if delivered within the allocated time, in itself be reprehensible behaviour (but let’s save that for another day). It is, of course, possible that time is exceeded while discussing new ideas and arguments, but that does not substantially change the impact of time abuse (see below).
The culprit also tends to be a male colleague that would be highly insulted if someone (particularly someone junior, particularly a female colleague) tried to whip him into compliance with the programmed schedule and the usually also agreed upon rules for the panel/session/roundtable—you know, those previous emails that seek to avoid content overlap and that make it clear that you have 20, 12 or however many minutes to deliver your initial material before engaging in questions and answers.
Further, in my experience, this is also much more likely to happen in academic environments where seniority—and fear of the power of the seniors—drives group dynamics in academic conferences. That is, I have experienced this usually in Mediterranean countries (particularly Spain and Italy) and sometimes in continental Europe (notably, Germany). I have had the opposite experiences in the UK and Finland, and further afield, to say it all.
The effects of abusing time at the podium tend to always be the same and threefold: audience disengagement, squeezing or suppression of time for others’ views (be it other scheduled presentations or genuine debate) and a general lowering of the quality of the event. In my view, generating these results is unethical and disrespectful, and those abusing their time at the podium should be shamed into shutting up and sitting down. Moreover, organisers of academic events have a fiduciary duty towards all speakers and participants to ensure delivery of the programme as planned. Let me expand.
Those that overrun demonstrate their disregard for other people’s work and time. They disrespect colleagues scheduled to speak after them and their effort preparing their own speeches, practicing them and timing them to fit the allocated slot. Of course, not having done so themselves, they are probably also underestimating the effort that goes into that. They clearly do not think that their participation in the conference is part of a greater whole and that they too are there to learn.
They also disrespect the audience and ignore the abuse of power that social convention enables them to exert by having been put in the spotlight. However interesting the message, the audience is not there solely to listen to them. The audience is also not there as a simple receptacle of their own voice, unless the event was clearly advertised as not including any sort of time for Q&A or interaction whatsoever. Finally, they also disrespect and put significant pressure on the moderator/chair/organiser, who is then forced to either become complicit in the unethical abuse or discharge their fiduciary duty, neither of which is their preferred alternative.
On the fiduciary duty. Well, it is plain to me that organisers of an event and chairs/moderators (organisers for short) have, first and foremost, a fiduciary duty towards the audience. That duty is, simply put, to do everything possible to deliver the programme as planned. Of course, there can be unforeseen circumstances that make it difficult or impossible. In that case, all that can be done is to readjust things as best as possible. However, tolerating time overruns is not such an unforeseen situation.
Moreover, organisers have a fiduciary duty to all speakers to enable them to present their ideas and to deliver the result of their work and preparation. Funnily enough, organisers are keener for this to happen when they pay the speakers than when they get the content for free, which is another f*&^ed aspect of the conferencing game (also best saved for another day). When organisers allow a/some speakers to abuse their time at the podium, they are simply telling all other speakers that their expertise and preparation are not as valuable as those of the perpetrator. Tertium non datur.
So, what’s the ethical approach to delivering a speech at a conference? As far as I can formulate it, I think this boils down to: showing up prepared, ready to present your best possible ideas and to deliver them to the best of your ability, within the allocated time, and being open to challenge and discussion, to engage with such exchanges, and to contribute to debates surrounding the contributions of other speakers and participants. Honestly, if someone is not ready to act ethically, I’d rather they stayed out of it. Whatever their expertise and brilliance. And the same goes for anyone organising these events. If not ready to run them ethically, then better not organise them at all.
If you got this far, you may be nodding in agreement. Or you may think that I am simply sour and should take it easier. Either way, thanks for reading.
The more I think about the use of blockchain solutions in the context of public procurement governance (and, more generally, of public services delivery), the more I find that the inability of blockchain technology to reliably connect to the ‘real world’ is bound to restrict any potentially useful applications to back-office functions and the procurement of strictly digital assets.
This is simply because blockchain can only generate its desirable effects of tamper-evident record-keeping and automated execution of smart contracts built on top of it to the extent that it does not require off-chain inputs. Blockchain is also structurally incapable of generating off-chain outputs by itself.
This is increasingly well known and is generating a sub-hype around oracles, which are devices aimed at plugging blockchains into the ‘real world’, either by feeding the blockchain with data, or by outputting data from the blockchain (as discussed eg here). In this blog post, I reflect on the minimal changes that I think the development of oracles is likely to bring about in the context of public procurement governance.
Why would blockchain be interesting in this context?
Generally, the potential for the use of blockchain and blockchain-enabled smart contracts to improve procurement governance is linked to the promise that it can help prevent corruption and mistakes in two ways: through the automation of decision-making throughout the procurement process and the execution of public contracts, and through the immutability (rectius, tamper-evidence) of procurement records. There are two main barriers to the achievement of such improvements over current processes and governance mechanisms. One concerns transaction costs and information asymmetries (as briefly discussed here). The other concerns the massive gap between the virtual on-chain reality and the off-chain real world, which oracles are trying to bridge.
The separation between on-chain and off-chain reality is paramount to the analysis of governance issues and the impact blockchain can have. If blockchain can only displace the focus of potential corrupt or mistaken intervention (by the public buyer, or by public contractors) but not eliminate such risks, its potential contribution to a revolution of procurement governance shrinks by several orders of magnitude. So it is important to assess the extent to which blockchain can be complemented with other solutions (oracles) to achieve the elimination of points of entry for corrupt or mistaken activity, rather than their displacement or substitution.
Oracles’ vulnerabilities: my puppy wears my fitbit
In simple terms, oracles are data interfaces that connect a blockchain to a database or a source of data (for a taxonomy and some discussion, see here). This makes them potentially unreliable as (i) the oracle can only be as good as the data it relies on and (ii) the oracle can itself be manipulated. There are thus two main sources of oracle vulnerability, which automatically translate into blockchain vulnerability.
First, the data can be manipulated, like when I prefer to sit and watch some TV rather than go for a run and tie my fitbit to my puppy’s collar so that, by midnight, I have still achieved my 10,000 daily steps.* Second, the oracle itself can be manipulated because it is a piece of software or hardware that can be tampered with, perhaps in a way that is not readily evident and that can only be uncovered with some serious IT forensics, like getting a friend to crack fitbit’s code and add 10,000 daily steps to my database without me even needing to charge my watch.**
Unlike when these issues concern the extent to which I lie to myself about my healthy lifestyle, these two vulnerabilities are highly problematic from a public governance perspective. Unless the data used in the interaction with the blockchain is itself automatically generated in a way that cannot be manipulated (and this starts to point at a mirror within a mirror situation, see below), the effect of implementing a blockchain plus oracle simply seems to be to displace the governance focus: controls now need to be placed on the source of the data and the devices used to collect it.
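To make the displacement point a bit more tangible, here is a minimal Python sketch of a hash-chained ledger fed by an oracle. It is a toy illustration under my own assumptions, not the API of any real blockchain or oracle: every name in it (block_hash, append_block, chain_is_intact, step_oracle) is hypothetical. The ledger notices when a stored record is edited afterwards, but it has no way of noticing that the record was untrue when it came in.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's content together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a record. Nothing here can check whether `data` is true."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"index": index, "data": data, "prev": prev_hash,
                  "hash": block_hash(index, data, prev_hash)})


def chain_is_intact(chain):
    """Return True if no stored record was altered after it was appended."""
    prev_hash = GENESIS
    for block in chain:
        expected = block_hash(block["index"], block["data"], block["prev"])
        if block["prev"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


def step_oracle(tracker_reading):
    """Hypothetical oracle: passes on the tracker's count, whoever wore it."""
    return {"owner": "me", "steps": tracker_reading}


ledger = []
append_block(ledger, step_oracle(10000))    # steps actually walked by the puppy
honest_looking = chain_is_intact(ledger)    # the false record counts as valid

ledger[0]["data"]["steps"] = 12000          # editing the stored record afterwards
tampering_hidden = chain_is_intact(ledger)  # the later edit is detected

print(honest_looking, tampering_hidden)     # prints: True False
```

The first check passes even though the steps were the puppy's, while the second fails as soon as the stored figure is edited. Tamper-evidence protects the record, not the truth of what was recorded, which is why the controls end up moving to the data source and the oracle.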
But oracles can get better! — sure, but only to deal with data
The sub-hype around oracles in blockchain discussions basically follows the same trend as the main hype around blockchain. The same way it is assumed that blockchain is bound to revolutionise everything because it will get so much better than it currently is, there are emerging arguments about the almost boundless potential for oracles to connect the real world to the blockchain in so much better ways. I do not have the engineering or futurology credentials necessary to pass judgement on this, but it seems to me plain to see that—unless we want to add an additional layer about robotics (and pretty evolved robotics at that), so that we consider blockchain+oracle+robot solutions—any and all advances will remain limited to improving the way data is generated/captured and exploited within and outside the blockchain.
So, for everything that is not data-based or data-transformable (such as the often used example of event tickets, which in the end get plugged back to a database that determines their effects in the real world) or, in other words, where moving digital tokens around does not generate the necessary effects in the real world, even much more advanced blockchain+oracle solutions are likely to remain of limited use in the context of procurement and the delivery of public services. Not because the applications are not (technically) possible, but because they generate governance problems that merely replace the current ones. And the advantage is not necessarily obvious.
How far can we displace governance problems and still reap some advantages?
What do I mean when I say that the advantage is not necessarily obvious? Well, imagine the possibility of having a blockchain+oracle control the inventory of a given consumable, so that the oracle feeds information into the blockchain about the existing level of stock and about new deliveries made by the supplier, and automated payments are made eg on a per available unit basis. This could be seen as a possible application to avoid the need for different ways of controlling the execution of the contract. It could even avoid the need to procure the consumable in the first place, if a smart contract in the blockchain (the same, or a separate one) is automatically buying them on the basis of a closed system (eg a framework agreement or dynamic purchasing system based on electronic catalogues) or even in the ‘open market’ of the internet. Would this not be advantageous from a governance perspective?
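A rough Python sketch of such an arrangement may help picture it. This is only an illustration under simplifying assumptions of my own: the StockReport fields, the settle_delivery function, the reorder rule and all figures are hypothetical, and payment is computed per delivered unit as reported by the oracle.

```python
from dataclasses import dataclass


@dataclass
class StockReport:
    """What a hypothetical inventory oracle feeds into the chain."""
    units_in_stock: int   # stock level after the delivery, as reported
    units_delivered: int  # size of the new delivery, as reported


def settle_delivery(report, unit_price, reorder_level, reorder_quantity):
    """Toy 'smart contract': pay per reported unit, reorder when stock is low.

    It acts on the oracle's report as given; it cannot inspect the warehouse.
    """
    payment_due = report.units_delivered * unit_price
    units_to_order = reorder_quantity if report.units_in_stock < reorder_level else 0
    return payment_due, units_to_order


# Made-up figures: 20 units reported as delivered at 2.5 each, stock of 40
# reported against a reorder level of 50.
payment, order = settle_delivery(StockReport(units_in_stock=40, units_delivered=20),
                                 unit_price=2.5, reorder_level=50, reorder_quantity=100)
print(payment, order)  # prints: 50.0 100
```

Both the payment and the automatic reorder follow mechanically from whatever the report says, so an oracle that over-reports deliveries or under-reports stock triggers them just the same.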
Well, I think it would be a matter of degree because there would still need to be a way of ensuring that the oracle is not tampered with and that what the oracle is capturing reflects reality. There are myriad ways in which you could manipulate most systems—and, given the right economic incentives, there will always be attempts to manipulate even the most sophisticated systems we may want to put in place—so checks will always be needed. At this stage, the issue becomes one of comparing the running costs of the system. Unless the cost of the blockchain+oracle+new checks (plus the cybersecurity needed to keep them up and properly running) is lower than the cost of existing systems (including inefficiencies derived from corruption and mistakes), there is no obvious advantage and likely no public interest in the implementation of solutions based on these disruptive technologies.
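The comparison can be written down as a back-of-the-envelope check. The Python sketch below is purely illustrative: the cost categories follow the paragraph above, but the function name and every figure are made-up assumptions, not estimates for any real system.

```python
def blockchain_case_holds(existing_running_cost, existing_losses,
                          chain_cost, oracle_cost, new_checks_cost, cybersecurity_cost):
    """Return True only if the proposed system is cheaper than the status quo.

    existing_losses stands for inefficiencies derived from corruption and mistakes.
    """
    status_quo = existing_running_cost + existing_losses
    proposed = chain_cost + oracle_cost + new_checks_cost + cybersecurity_cost
    return proposed < status_quo


# Hypothetical annual figures in the same (arbitrary) currency unit.
modest_losses = blockchain_case_holds(100, 30, 60, 25, 30, 20)  # 135 vs 130
heavy_losses = blockchain_case_holds(100, 50, 60, 25, 30, 20)   # 135 vs 150
print(modest_losses, heavy_losses)  # prints: False True
```

With these invented numbers the business case flips depending on how large the losses from corruption and mistakes are assumed to be, which is exactly the figure that is hardest to estimate.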
Which leads me to the new governance issue that has started to worry me: the control of ‘business cases’ for the implementation of blockchain-based solutions in the context of public procurement (and public governance more generally). Given the lack of data and the difficulty in estimating some of the risks and costs of both the existing systems and any proposed new blockchain solutions, who is doing the math and on the basis of what? I guess convincingly answering this will require some more research, but I certainly have a hunch that not much robust analysis is going on…
* I do not have a puppy, though, so I really end up doing my own running…
** I am not sure this is technically doable, but hopefully it works for the sake of the example…
Researching the area of artificial intelligence and the law (AI & Law) has recently taken me to the complexities of natural language processing (NLP) applied to legal texts (aka legal text analytics). Trying to understand the extent to which AI can be used to perform automated legal analysis (or, more modestly, to support humans in performing legal analysis) requires (at least) a view of the current possibilities for AI tools to (i) extract information from legal sources (or ‘understand’ them and their relationships), (ii) assess their relevance to a given legal problem and (iii) apply the legal source to provide a legal solution to the problem (or to suggest one for human validation).
Of course, this leaves aside other issues such as the need for AI to be able to understand the factual situation to formulate the relevant legal problem, to assess or rank different legal solutions where available, or to take into account additional aspects such as the likelihood of obtaining a remedy, etc, all of which could be tackled by fields of AI & Law different from legal text analytics. The above also ignores other aspects of ‘understanding’ documents, such as the ability for an algorithm to distinguish factual and legal issues within a legal document (eg a judgment) or to extract basic descriptive information (eg being able to create a citation based on the information in the judgment, or to cluster different types of provisions within a contract or across contracts). Some of this seems to be at hand or soon to be developed on the basis of the recently released Google ‘Document Understanding AI’ tool.
The latest issue of Artificial Intelligence and the Law luckily concentrates on ‘Natural Language Processing for Legal Texts’ and offers some help in trying to understand where things currently stand regarding issues (i) and (ii) above. In this post, I offer some reflections based on my understanding of two of the papers included in the special issue: Nanda et al (2019) and Chalkidis & Kampas (2019). I may have gotten the specific technical details wrong (although I hope not), but I think I got the functional insights.
Establishing relationships between legal sources
One of the problems that legal text analytics is trying to solve concerns establishing relationships between different legal sources, which can be a partial aspect of the need to ‘understand’ them (issue (i) above). This is the main problem discussed in Nanda et al, ‘Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives’ (2019) 27(2) Artificial Intelligence and Law 199-225. In this piece of research, AI is used to establish whether a provision of a national implementing measure (NIM) transposes a specific article of an EU Directive or not. In extremely simplified terms, the researchers train different algorithms to perform text comparison. The researchers work on a closed list of 43 EU Directives and the corresponding Luxembourgian, Irish and Italian NIMs. The following table plots their results.
Nanda et al (2019: 208, Figure 6).
The table shows that the best AI solution developed by the researchers (the TF-IDF cosine) achieves levels of precision of around 83% for Luxembourg, 77% for Italy and 68% for Ireland. These seem like rather impressive results but a qualitative analysis of their experiment indicates that the significantly better performance for Luxembourgian transposition over Italian or Irish transposition likely results from the fact that Luxembourg tends to largely ‘copy & paste’ EU Directives into national law, whereas the Italian and Irish legislators adopt a more complex approach to the integration of EU rules into their existing legal instruments.
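For readers who, like me, find it easier to see the mechanics, here is a minimal pure-Python sketch of TF-IDF weighting with cosine similarity. It is not the implementation used by Nanda et al: the tokenisation, the smoothed IDF formula and the function names are my own assumptions, and the four short ‘provisions’ are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, dropping basic punctuation."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


def tfidf_vectors(documents):
    """Build a TF-IDF weighted term vector (as a dict) for each document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # term frequency times a smoothed inverse document frequency
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Invented texts: a directive article, a copy-paste NIM, a reworded NIM
# and an unrelated national rule.
directive = "Contracting authorities shall treat economic operators equally and without discrimination."
nim_copy = "Contracting authorities shall treat economic operators equally and without discrimination."
nim_reworded = "Contracting authorities must ensure equal treatment of economic operators."
unrelated = "The speed limit on motorways is 120 kilometres per hour."

vecs = tfidf_vectors([directive, nim_copy, nim_reworded, unrelated])
copy_score = cosine(vecs[0], vecs[1])
reworded_score = cosine(vecs[0], vecs[2])
unrelated_score = cosine(vecs[0], vecs[3])
print(round(copy_score, 2), round(reworded_score, 2), round(unrelated_score, 2))
```

In this toy setting the copy-paste provision scores close to 1, the reworded one scores well below it despite saying much the same thing, and the unrelated rule scores 0. That is consistent with the intuition that word-overlap measures reward ‘copy & paste’ transposition.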
Moreover, it should be noted that the algorithms are working on a very specific issue, as they are only assessing the correspondence between provisions of EU and NIM instruments that were related. That is, they are operating in a closed or walled dataset that does not include NIMs that do not transpose any of the 43 chosen Directives. Once these aspects of the research design are taken into account, there are a number of unanswered questions, such as the precision that the algorithms would have if they had to compare entire NIMs against an open-ended list of EU Directives, or if they were used to screen for transposition rules. While the first issue could probably be answered simply by extending the experiment, the second issue would probably require a different type of AI design.
On the whole, my impression after reading this interesting piece of research is that AI is still relatively far from a situation where it can provide reliable answers to the issue of establishing relationships across legal sources, particularly if one thinks of relatively more complex relationships than transposition within the EU context, such as development, modification or repeal of a given set of rules by other (potentially dispersed) rules.
Establishing relationships between legal problems and legal sources
A separate but related issue requires AI to identify legal sources that could be relevant to solve a specific legal problem (issue (ii) above)—that is, the relevant relationship is not across legal sources (as above), but between a legal problem or question and relevant legal sources.
This is covered in part of the literature review included in Chalkidis & Kampas, ‘Deep learning in law: early adaptation and legal word embeddings trained on large corpora’ (2019) 27(2) Artificial Intelligence and Law 171-198 (see esp 188-194), where they discuss some of the solutions given to the task of the Competition on Legal Information Extraction/Entailment (COLIEE) from 2014 to 2017, which focused ‘on two aspects related to a binary (yes/no) question answering as follows: Phase one of the legal question answering task involves reading a question Q and extract[ing] the legal articles of the Civil Code that are relevant to the question. In phase two the systems should return a yes or no answer if the retrieved articles from phase one entail or not the question Q’.
The paper covers four different attempts at solving the task. It reports that the AI solutions developed to address the two binary questions achieved the following levels of precision: 66.67% (Morimoto et al. (2017)); 63.87% (Kim et al. (2015)); 57.6% (Do et al. (2017)); 53.8% (Nanda et al. (2017)). Once again, these results are rather impressive but some contextualisation may help to assess the extent to which this can be useful in legal practice.
The best AI solution was able to identify relevant provisions that entailed the relevant question 2 out of 3 times. However, the algorithms were once again working on a closed or walled field because they solely had to search for relevant provisions in the Civil Code. One can thus wonder whether algorithms confronted with the entirety of a legal order would be able to reach even close degrees of accuracy.
Based on the current state of legal text analytics (as far as I can see it), it seems clear that AI is far from being able to perform independent/unsupervised legal analysis and provide automated solutions to legal problems (issue (iii) above) because there are still very significant shortcomings concerning issues of ‘understanding’ natural language legal texts (issue (i)) and adequately relating them to specific legal problems (issue (ii)). That should not be surprising.
However, what also seems clear is that AI is very far from being able to confront the vastness of a legal order and that, much as lawyers themselves, AI tools need to specialise and operate within the narrower boundaries of sub-domains or quite contained legal fields. When that is the case, AI can achieve much higher degrees of precision—see examples of information extraction precision above 90% in Chalkidis & Kampas (2019: 194-196) in projects concerning Chinese credit fraud judgments and Canadian immigration rules.
Therefore, the current state of legal text analytics seems to indicate that AI is (quickly?) reaching a point where algorithms can be used to extract legal information from natural language text sources within a specified legal field (which needs to be established through adequate supervision) in a way that allows it to provide fallible or incomplete lists of potentially relevant rules or materials for a given legal issue. However, this still requires legal experts to complement the relevant searches (to bridge any gaps) and to screen the proposed materials for actual relevance. In that regard, AI does hold the promise of much better results than previous expert systems and information retrieval systems and, where adequately trained, it can support and potentially improve legal research (ie cognitive computing, along the lines developed by Ashley (2017)). However, in my view, there are extremely limited prospects for ‘independent functionality’ of legaltech solutions. I would happily hear arguments to the contrary, though!
Despite growing global interest in the use of algorithmic behavioural screens, big data and machine learning to detect bid rigging in procurement markets, the UK’s Competition and Markets Authority (CMA) was under no obligation to undertake a project in this area, much less to publish a bid-rigging algorithmic screening tool and make it generally available. Yet, in 2017 and under self-imposed pressure, the CMA released ‘Screening for Cartels’ (SfC) as ‘a tool to help procurers screen their tender data for signs of illegal bid-rigging activity’ and has since been trying to raise its profile internationally. There is thus a possibility that the SfC tool is not only used by UK public buyers, but also disseminated and replicated in other jurisdictions seeking to implement ‘tried and tested’ solutions to screen for cartels. This paper argues that such a legal transplant would be undesirable.
In order to substantiate this main claim, and after critically assessing the tool, the paper tracks the origins of the indicators included in the SfC tool to show that its functionality is rather limited as compared with alternative models that were put to the CMA. The paper engages with the SfC tool’s creation process to show how it is the result of poor policy-making based on the material dismissal of the recommendations of the consultants involved in its development, and that this has resulted in the mere illusion that big data and algorithmic screens are being used to detect bid rigging in the UK. The paper also shows that, as a result of the ‘distributed model’ used by the CMA, the algorithms underlying the SfC tool cannot be improved through training, the publication of the SfC tool lowers the likelihood of some types of ‘easy to spot cases’ by signalling areas of ‘cartel sophistication’ that can bypass its tests and that, on the whole, the tool is simply not fit for purpose. This situation is detrimental to the public interest because reliance on a defective screening tool can create a false perception of competition for public contracts, and because it leads to immobilism that delays (or prevents) a much-needed engagement with the extant difficulties in developing a suitable algorithmic screen based on proper big data analytics. The paper concludes that competition or procurement authorities willing to adopt the SfC tool would be buying fool’s gold and that the CMA was wrong to cheat at solitaire to expedite the deployment of a faulty tool.
The full citation of the paper is: Sanchez-Graells, Albert, ‘Screening for Cartels’ in Public Procurement: Cheating at Solitaire to Sell Fool’s Gold? (May 3, 2019). Available at SSRN: https://ssrn.com/abstract=3382270
I was really proud to see that the University of Bristol declared a climate emergency. It was one of those moments that makes you feel part of a worthwhile institution (despite its many other flaws, like all institutions). Inspired by the exploding #Fridaysforclimate movement and the speeches of brave activist @GretaThunberg, I had been thinking about what I could personally do to contribute to the needed paradigm change. It did not take much reflection to realise that the most effective change in my professional life would clearly be to cut down travel, especially by air. And so, the University’s announcement prompted me to ‘go public’ with it.
This tweet prompted a series of exchanges with colleagues from Bristol and elsewhere. The reaction was mainly in three directions. First, that such a personal ‘no travel policy’ may be impossible to adopt in the context of (UK) academia, where public and conference speaking is used as both a measure of ‘academic productivity’ and as a proxy for esteem/standing in the field for the purposes of eg promotion—so, either you travel, or you may be seen as not doing your job and/or not worthy of (further) promotion. Second, that this would reduce the likely impact of my research and cut me off from potentially relevant audiences. Third, that this would exclude some of the very enjoyable moments that come with academic conferences, where you end up socialising with like-minded colleagues and developing networks of collaborators and, if lucky, friends.
All of these are important points, so I have given this a little bit more thought.
First, I have to concede that not traveling to conferences will be an issue in terms of justifying my engagement with the academic (and policy-making) communities unless I manage to find a way to still participate in conferences. But this should not be too difficult. Today, there is a large number of options to organise webinars and to allow for remote participation in meetings, so there is really no excuse not to take advantage of them. The technology is there and most institutions offer the required equipment and software, so it is high time that academics (and policy-makers) start using it as the default way of organising our interactions. This can even have secondary positive effects, such as the possibility of recording and publishing all or part of the conferences/meetings, so that different people can engage with the discussion at different times.
I also concede that not traveling to conferences and workshops can have a negative impact on ‘CV-building’ and that this will reduce any academic’s prospect of promotion. But I can only say that, to my shame and regret, I have been burning too much CO2 to get to my current academic position. In current lingo, I have exhausted (or, more likely, exceeded) my CO2 budget for conferences, so I can no longer afford to do it. If this means that my employer may not consider me deserving of a higher academic position as they may otherwise have, then I will have to accept any delays that come from implementing a no travel policy. In the grand scheme of things, this is a tiny sacrifice.
I acknowledge that this is something I can do from the very privileged academic position I am lucky to have, so I have no intention of proselytising. However, I do plan to try to change the system. I will work with my local trade union branch to see if we can make specific proposals to reduce the CO2 footprint of the promotions procedure. I will also organise webinars and non-presential conferences and offer every opportunity I can, in particular to early career researchers, so that academics can carry on with ‘CV-building’ (and, more importantly, knowledge-exchange) despite not traveling. These are the remedial actions I can and will implement. If you can think of others, please let me know. I would be more than happy to chip in.
Second, I must say that I have generally reached the audience for my academic work online. Only very rarely have I spoken at a conference or workshop where participants did not know my work from my SSRN page and this blog. With the partial exception of Brussels-based policy-makers (when I have been a member of expert groups), every other policy-making body and NGO that has engaged with my work has done so remotely and, oftentimes, without any sort of direct conversation or exchange. There are plenty of opportunities for academics to share their work online on open access and this has made the need for last-century-type conferences and workshops largely redundant for the purposes of knowledge and research dissemination. We need to realise this and use it to the advantage of a lower CO2 footprint for knowledge exchange.
Third, the social component is more difficult to address. There is no question that socialising at conferences and workshops has value in and of itself. It is also clear that, once you establish a network, you do not need to meet regularly with your collaborators and friends (however nice it is) to keep it going. So this may be the only aspect of conference travel that could justify going to a very specific event eg to establish new connections or to rekindle/deepen existing ones. But maybe this can be done without flying—eg in the case of UK-based academics like me, by prioritising conferences in Europe and convincing our employers and ourselves to take the extra time to travel by train or bus (anecdotally, most academics I know love train trips).
So, all in all, I have reaffirmed my commitment to minimise my conference travel and, from today, I plan to not accept invitations to speak at or attend any conferences that require me to fly (although I will still fulfill the few prior commitments that I have). I will always ask for a ‘virtual alternative’, though, and I am really hoping that this will be acceptable (or even welcome).
Thus, in case you organise a conference on a topic within my expertise, here is my message: I will not fly to your conference, but I hope you will still invite me to participate. I hope you will because we have the technology to do this and because I value our exchanges.
The UK Cabinet Office is currently consulting on its draft policy on ‘Social Value in Government Contracts’ and will be receiving submissions until 10 June 2019. Below is my contribution to the public consultation, which will probably make more sense if read after the consultation paper. Comments and feedback most welcome.
The EUI Robert Schuman Centre for Advanced Studies’ working papers series has two interesting recent additions on the economic analysis of procurement regulation and its effects on competition, efficiency and value for money. Both papers are by BKO Tas.
The first paper: ‘Bunching Below Thresholds to Manipulate Public Procurement’ explores the effects of a contracting authority’s ‘bunching strategy’ to seek to exercise more discretion by artificially estimating the value of future contracts just below the thresholds that would trigger compliance with EU procurement rules. This paper is relevant to the broader discussion on the usefulness and adequacy of current EU (and WTO GPA) value thresholds (see eg the work of Telles, here and here), as well as on the regulatory decisions that EU Member States face on whether to extend the EU rules to ‘below-threshold’ contracts.
The second paper: ‘Effect of Public Procurement Regulation on Competition and Cost-Effectiveness’ uses the World Bank’s ‘Benchmarking Public Procurement’ quality scores to empirically test the positive effects of improved regulation quality on competition and value for money, measured as increases in the number of bidders and the probability that procurement price is lower than estimated cost. This paper is relevant in the context of recent discussions about the usefulness or not of procurement benchmarks, and regarding the increasing concern about reduced number of bids in EU-regulated public tenders.
In this blog post, I reflect on the methodology and insights of both papers, paying particular attention to the fact that both papers build on datasets and/or indexes (TED, the WB benchmark) that I find rather imperfect and unsuitable for this type of analysis (regarding TED, in the context of the Single Market Scoreboard for Public Procurement (SMPP) that builds upon it, see here; regarding the WB benchmark, see here). Therefore, not all criticisms below are to the papers themselves, but rather to the distortions that skewed, incomplete or misleading data and indicators can have on more refined analysis that builds upon them.
Bunching Below Thresholds to Manipulate Procurement (Tas: 2019a)
It is well-known that the EU procurement rules are based on a series of jurisdictional triggers and that one of them concerns value thresholds—currently regulated in Arts 4 & 5 of Directive 2014/24/EU. Contracts with an estimated value above those thresholds are subjected to the entire EU procurement regulation, whereas contracts of a lower value are solely subjected to principles-based requirements where they are of ‘cross-border interest’. Given the obvious temptation/interest in keeping procurement shielded from EU requirements, the EU Directives have included an anti-circumvention rule aimed at preventing Member States from artificially splitting contracts in order to keep their award below the relevant jurisdictional thresholds (Art 5(3) Dir 2014/24). This rule has been interpreted expansively by the Court of Justice of the European Union (see eg here).
‘Bunching Below Thresholds to Manipulate Public Procurement’ examines the effects of a practice that would likely infringe the anti-circumvention rule, as it assesses a strategy of ‘bunching estimated costs just below thresholds’ ‘to exercise more discretion in public procurement’. The paper develops a methodology to identify contracting authorities ‘that have higher probabilities of bunching estimated values below EU thresholds’ (ie manipulative authorities) and finds that ‘[m]anipulative authorities have significantly lower probabilities of employing competitive procurement procedure. The bunching manipulation scheme significantly diminishes cost-effectiveness of public procurement. On average, prices of below threshold contracts are 18-28% higher when the authority has an elevated probability of bunching.’ These are quite striking (but perhaps not surprising) results.
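To make the notion of ‘bunching’ more concrete, here is a minimal Python sketch of a naive check. It compares how many estimated contract values fall in a narrow band just below a threshold with how many fall in an equally wide band just above it. This is not the methodology used in Tas (2019a); the function name, the 5% band width, the threshold and the sample values are all assumptions made purely for illustration.

```python
def bunching_ratio(estimated_values, threshold, band=0.05):
    """Return (count just below, count just above, ratio) around a threshold.

    'Just below' means values in [threshold * (1 - band), threshold);
    'just above' means values in [threshold, threshold * (1 + band)).
    The ratio is None when no value falls just above the threshold.
    """
    lower = threshold * (1 - band)
    upper = threshold * (1 + band)
    below = sum(1 for v in estimated_values if lower <= v < threshold)
    above = sum(1 for v in estimated_values if threshold <= v < upper)
    ratio = below / above if above else None
    return below, above, ratio


# Hypothetical estimated values (EUR) for one authority and a hypothetical threshold.
values = [95_500, 96_500, 98_000, 99_000, 99_900, 101_000, 120_000, 40_000]
print(bunching_ratio(values, threshold=100_000))
```

A ratio well above 1 would only hint at a concentration of estimates just below the threshold. Crucially, a check of this kind can only see contracts that were actually recorded in the dataset it is fed with.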
The paper employs a regression discontinuity approach to determine the likelihood of bunching. In order to do that, the paper relies on the TED database. The paper is certainly difficult to read and hardly intelligible for a lawyer, but there are some issues that raise important questions. One concerns the author’s (mis)understanding of how the WTO GPA and the EU procurement rules operate, in particular when the paper states that ‘Contracts covered by the WTO GPA are subject to additional scrutiny by international organizations and authorities (sic). Accordingly, contracts covered by the WTO GPA are less likely to be manipulated by EU authorities’ (p. 12). This is simply an uncritical transplant of considerations made by the authors of a paper that examined procurement in the Czech Republic, where the relevant threshold between EU covered and non-EU covered procurement would make sense. Here, the distinction between WTO GPA and EU-covered procurement simply makes no sense, given that WTO GPA and EU thresholds are coordinated. This alone raises some issues concerning the tests designed by the author to check the robustness of the hypothesis that bunching leads to inefficiency in procurement expenditure.
Another issue concerns the way in which the author equates open procedures to a ‘first price auction mechanism’ (which they are not exactly) and dismisses other procedures (notably, the restricted procedure) as incapable of ensuring value for money or, more likely, as representative of a higher degree of discretion for the contracting authority—which is a highly questionable assumption.
More importantly, I am not sure that the author understood what is in the TED database and, crucially, what is not there (see section 2 of Tas (2019a) for methodology and data description). Albeit not very clearly, the author presents TED as a comprehensive database of procurement notices—ie, as if 100% of procurement expenditure by Member States was recorded there. However, in the specific context of bunching below thresholds, the TED database is very likely to be incomplete.
Contracting authorities tendering contracts below EU thresholds are under no obligation to publish a contract notice (Art 49 Dir 2014/24). They could publish voluntarily, in particular in the form of a voluntary ex ante transparency (VEAT) notice, but that would make no sense from the perspective of a contracting authority that seeks to avoid compliance with EU rules by bunching (ie manipulating) the estimated contract value, as that would expose it to potential litigation. Most authorities that are bunching their procurement needs (or, in simple terms, avoiding compliance with the EU rules) will not be reflected in the TED database at all, or will not be identified by the methodology used by Tas (2019a), as they will not have filed any notices for contracts below thresholds.
How is it possible that TED includes notices regarding contracts below the EU thresholds, then? Well, this is anybody’s guess, but mine is that a large proportion of those notices will be linked either to countries with a tradition of full transparency (over-reporting) or to contracts where there are doubts about the potential cross-border interest (sometimes assessed over-cautiously), or will be notices with mistakes, where the estimated value of the contract is erroneously indicated as below thresholds.
Even if my guess were incorrect and all notices for contracts with a value below thresholds were accurate and justified by the existence of a potential cross-border interest, the database cannot be considered complete. One of the issues raised (imperfectly) by the Single Market Scoreboard (indicator [3] Publication rate) is the relatively low level of procurement that is advertised in TED compared to the (putative/presumptive) total volume of procurement expenditure by the Member States. Without information on the conditions of the vast majority of contract awards (below thresholds, unreported, etc), any analysis of potential losses of competitiveness / efficiency in public expenditure (due to bunching or otherwise) is bound to be misleading.
Moreover, Tas (2019a) is premised on the hypothesis that procurement below EU thresholds allows for significantly more discretion than procurement above those thresholds. However, this hypothesis fails to recognise the variety of transposition strategies at Member State level. While some countries have opted for less stringent below EU threshold regimes, others have extended the EU rules to the entirety of their procurement (or, perhaps, to contracts up to and including much lower values than the EU thresholds, with the exception of some class of ‘micropurchases’). This would require the introduction of a control that could refine Tas’ analysis and distinguish those cases of bunching that do lead to more discretion and those that do not (at least formally)—which could perhaps distinguish price effects derived from national-only transparency from those of more legally-dubious maneuvering.
In my view, regardless of the methodology and the math underpinning the paper (which I am in no position to assess in detail), once these data issues are taken into account, the story the paper tries to tell breaks down. There are important shortcomings in its empirical strategy that raise significant issues around the strength of its findings—assessed not against the information in TED, but against the (largely unknown, unrecorded) reality of procurement in the EU.
I have no doubt that there is bunching in practice, and that the intuition that it raises procurement costs must be right, but I have serious doubts about the possibility to reliably identify bunching or estimate its effects on the basis of the information in TED, as most culprits will not be included and the effects of below threshold (national) competition only will mostly not be accounted for.
It is also a very intuitive hypothesis that better regulation should lead to better procurement outcomes and, consequently, that more open and robust procurement rules should lead to more efficiency in the expenditure of public funds. As mentioned above, Tas (2019b) explores this hypothesis and seeks to empirically test it using the TED database and the World Bank’s Benchmarking Public Procurement (in its 2017 iteration, see here). I will not repeat my misgivings about the use of the TED database as a reliable source of information. In this second part, I will solely comment on the use of the WB’s benchmark.
The paper relies on four of the WB’s benchmark indicators (one further constructed by Djankov et al (2017)): the ‘bid preparation score, bid and contract management score, payment of suppliers score and PP overall index’. The paper includes a useful table with these values (see Tas (2019b: Table 4)), which allows the author to rank the countries according to the quality of their procurement regulation. The findings of Tas (2019b) are thus entirely dependent on the quality of the WB’s benchmark and its ability to capture (and distinguish) good procurement regulation.
In order to test the extent to which the WB’s benchmark is a good input for this sort of analysis, I have compared it to the indicator that results from the European Commission’s Single Market Scoreboard for Public Procurement (SMSPP, in its 2018 iteration). The comparison is rather striking …
Source: own elaboration.
Clearly, both sets of indicators are based on different methodologies and measure relatively different things. However, they are both intended to express relevant regulators’ views on what constitutes ‘good procurement regulation’. In my view, both of them fail to do so for reasons already given (see here and here).
The implication for work such as Tas (2019b) is that the reliability of the findings—regardless of the math underpinning them—is as weak as the indicators they are based on. Likely, applying the same methods to the SMSPP instead of the WB’s index would yield very different results—perhaps, that countries with very low quality of procurement regulation (as per the SMSPP index) achieve better economic results, which would not be a popular story with policy-makers… and the results with either index would also be different if the algorithms were not fed by TED, but by a more comprehensive and reliable database.
So, the most that can be said is that attempts to empirically show effects of good (or poor) procurement regulation remain doomed to fail or, in perhaps less harsh terms, doomed to tell a story based on a very skewed, narrow and anecdotal understanding of procurement and an incomplete recording of procurement activity. Believe those stories at your own peril…
There is a growing interest in the use of big data to improve public procurement performance and to strengthen procurement governance. This is a worthy endeavour and, like many others, I am concentrating my research efforts in this area. I have not been doing this for too long. However, soon after one starts researching the topic, a preliminary conclusion clearly emerges: without good data, there is not much that can be done. No data, no fun. So far so good.
It is thus a little discouraging to confirm that, as is widely accepted, there is no good data architecture underpinning public procurement practice and policy in the EU (and elsewhere). Consequently, there is a rather limited prospect of any real implementation of big data-based solutions, unless and until there is a significant investment in the creation of a proper data foundation that can enable advanced analysis and policy-making. Adopting the Open Contracting Data Standard for the European Union would be a good place to start. We could then discuss to what extent the data needs to be fully open (hint: it should not be, see here and here), but let’s save that discussion for another day.
What a recent Twitter thread has reminded me is that there is a bigger downside to the existence of poor data than being unable to apply advanced big data analytics: the formulation of procurement policy on the basis of poor data and poor(er) statistical analysis.
This reflection emerged on the basis of the 2018 iteration of the Single Market Scoreboard for Public Procurement (the SMSPP), which is the closest the European Commission is getting to data-driven policy analysis, as far as I can see. The SMSPP is still a work in progress. As such, it requires some close scrutiny and, in my view, strong criticism. As I will develop in the rest of this post, the SMSPP is problematic not solely in the way it presents information—which is clearly laden with implicit policy judgements of the European Commission—but, more importantly, due to its inability to inform either cross-sectional (ie comparative) or time series (ie trend) analysis of public procurement policy in the single market. Before developing these criticisms, I will provide a short description of the SMSPP (as I understand it).
The Single Market Scoreboard for Public Procurement: what is it?
The European Commission has developed the broader Single Market Scoreboard (SMS) as an instrument to support its effort of monitoring compliance with internal market law. The Commission itself explains that the “scoreboard aims to give an overview of the practical management of the Single Market. The scoreboard covers all those areas of the Single Market where sufficient reliable data are available. Certain areas of the Single Market such as financial services, transport, energy, digital economy and others are closely monitored separately by the responsible Commission services” (emphasis added). The SMS organises information in different ways, such as by stage in the governance cycle; by performance per Member State; by governance tool; by policy area or by state of trade integration and market openness (the latter two are still work in progress).
The SMS for public procurement (SMSPP) is an instance of SMS by policy area. It thus represents the Commission’s view that the SMSPP is (a) based on sufficiently reliable data, as it is fed from the database resulting from the mandatory publications of procurement notices in the Tenders Electronic Daily (TED), and (b) a useful tool to provide an overview of the functioning of the single market for public procurement or, in other words, of the ‘performance’ of public procurement, defined as a measure of ‘whether purchasers get good value for money’.
The SMSPP determines the overall performance of a given Member State by aggregating a number of indicators. Currently, the SMSPP is based on 12 indicators (it used to be based on a smaller number, as discussed below): [1] Single bidder; [2] No calls for bids; [3] Publication rate; [4] Cooperative procurement; [5] Award criteria; [6] Decision speed; [7] SME contractors; [8] SME bids; [9] Procedures divided into lots; [10] Missing calls for bids; [11] Missing seller registration numbers; [12] Missing buyer registration numbers. As the SMSPP explains, the addition of these indicators results in the measure of ‘overall performance’, which
is a sum of scores for all 12 individual indicators (by default, a satisfactory performance in an individual indicator increases the overall score by one point while an unsatisfactory performance reduces it by one point). The 3 most important are triple-weighted (Single bidder, No calls for bids and Publication rate). This is because they are linked with competition, transparency and market access–the core principles of good public procurement. Indicators 7-12 receive a one-third weighting. This is because they measure the same concepts from different perspectives: participation by small firms (indicators 7-9) and data quality (indicators 10-12).
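The arithmetic of this weighting can be sketched in a few lines of Python. This is only my reading of the quoted passage, not the Commission’s own implementation: the names are hypothetical, the weight of 1 for indicators 4-6 follows from the ‘by default’ rule, and treating an indicator that is neither satisfactory nor unsatisfactory as worth 0 points is an assumption.

```python
from fractions import Fraction

# Weights as described in the quoted methodology: indicators 1-3 are
# triple-weighted and indicators 7-12 receive a one-third weighting.
# Indicators 4-6 keep the default weight of 1.
WEIGHTS = {i: Fraction(3) for i in (1, 2, 3)}
WEIGHTS.update({i: Fraction(1) for i in (4, 5, 6)})
WEIGHTS.update({i: Fraction(1, 3) for i in range(7, 13)})

# Assumption: a rating that is neither satisfactory nor unsatisfactory scores 0.
POINTS = {"satisfactory": 1, "average": 0, "unsatisfactory": -1}


def overall_performance(ratings):
    """Sum the weighted points over the 12 indicators.

    `ratings` maps each indicator number (1 to 12) to a rating label.
    """
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover indicators 1 to 12 exactly")
    return sum(WEIGHTS[i] * POINTS[label] for i, label in ratings.items())


# Hypothetical country: unsatisfactory on the three triple-weighted
# indicators, satisfactory on the other nine.
ratings = {i: "unsatisfactory" if i <= 3 else "satisfactory" for i in range(1, 13)}
print(overall_performance(ratings))
```

Under these assumptions, the overall score ranges from -14 to 14, and the three triple-weighted indicators alone account for 9 of those 14 points, so they dominate the result.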
The most recent snapshot of overall procurement performance is represented in the map below, which would indicate that procurement policy is rather dysfunctional—as most EEA countries do not seem to be doing very well.
Source: European Commission, 2018 Single Market Scoreboard for Public Procurement (based on 2017 data).
In my view, this use of the available information is very problematic: (a) to begin with, because the data in TED can hardly be considered ‘sufficiently reliable’. The database in TED has problems of various sorts because it is a database that is constructed as a result of the self-declaration of data by the contracting authorities of the Member States, which makes its content very heterogeneous and difficult to analyse, including significant problems of under-inclusiveness, definitional fuzziness and the lack of filtering of errors—as recognised, repeatedly, in the methodology underpinning the SMSPP itself. This should make one take the results of the SMSPP with more than a pinch of salt. However, these are not all the problems implicit in the SMSPP.
More importantly: (b) the definition of procurement performance and the ways in which the SMSPP seeks to assess it are far from universally accepted. They are rather judgement-laden and reflect the policy biases of the European Commission without making this sufficiently explicit. This issue requires further elaboration.
The SMSPP as an expression of policy-making: more than dubious judgements
I already criticised the Single Market Scoreboard for public procurement three years ago, mainly on the basis that some of the thresholds adopted by the European Commission to establish whether countries performed well or poorly in relation to a given indicator were not properly justified or backed by empirical evidence. Unfortunately, this remains the case and the Commission is yet to make a persuasive case for its decision that eg, in relation to indicator [4] Cooperative procurement, countries that aggregate 10% or more of their procurement achieve good procurement performance, while countries that aggregate less than 10% do not.
Similar issues arise with other indicators, such as [3] Publication rate, which measures the value of procurement advertised on TED as a proportion of national Gross Domestic Product (GDP). It is given threshold values of more than 5% for good performance and less than 2.5% for poor performance. The Commission considers that this indicator is useful because ‘A higher score is better, as it allows more companies to bid, bringing better value for money. It also means greater transparency, as more information is available to the public.’ However, this is inconsistent with the fact that the SMSPP methodology stresses that it is affected by the ‘main shortcoming … that it does not reflect the different weight that government spending has in the economy of a particular’ Member State (p. 13). It also fails to account for different economic models where some Member States can retain a much larger in-house capability than others, as well as failing to reflect other issues such as fiscal policies, etc. Moreover, the SMSPP includes a note that says that ‘Due to delays in data availability, these results are based on 2015 data (also used in the 2016 scoreboard). However, given the slow changes to this indicator, 2015 results are still relevant.’ I wonder how it is possible to establish that there are ‘slow changes’ to the indicator where there is no more current information. On the whole, this is clearly an indicator that should be dropped, rather than included with such a phenomenal number of (partially hidden) caveats.
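A small Python sketch may help to see why the different weight of government spending matters for this indicator. The figures and names below are hypothetical, and the ‘intermediate’ label for rates between 2.5% and 5% is my own assumption, as only the ‘good’ and ‘poor’ thresholds are mentioned above.

```python
def publication_rate(value_advertised_on_ted, gdp):
    """Value of procurement advertised on TED as a proportion of GDP."""
    return value_advertised_on_ted / gdp


def classify_publication_rate(rate):
    """Apply the thresholds described above: more than 5% is good, less than 2.5% is poor."""
    if rate > 0.05:
        return "good"
    if rate < 0.025:
        return "poor"
    return "intermediate"  # assumption: label for the band in between


# Two hypothetical countries with the same GDP (1,000 units) that both
# advertise exactly half of their public purchasing on TED. They differ only
# in how much their public sector buys: 120 units versus 40 units.
country_a = publication_rate(value_advertised_on_ted=60, gdp=1_000)
country_b = publication_rate(value_advertised_on_ted=20, gdp=1_000)
print(classify_publication_rate(country_a), classify_publication_rate(country_b))
```

In this stylised example, both countries are equally transparent about what they buy, yet they land on opposite sides of the thresholds simply because public purchasing weighs differently in their economies.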
On the whole, then, the SMSPP and a number of the indicators on which it is based reflect the implicit policy biases of the European Commission. In my view, it is disingenuous to try to save this by simply stressing the following about the SMSPP and its indicators:
Like all indicators, however, they simplify reality. They are affected by country-specific factors such as what is actually being bought, the structure of the economies concerned, and the relationships between different tendering options, none of which are taken into account. Also, some aspects of public procurement have been omitted entirely or covered only indirectly, e.g. corruption, the administrative burden and professionalism. So, although the Scoreboard provides useful information, it gives only a partial view of EU countries' public procurement performance.
I would rather argue that, in these conditions, the SMSPP is not really useful, in particular because it fails to enable two types of analysis that could offer some valuable insights despite the shortcomings of the underlying indicators: first, a cross-sectional analysis comparing different countries under a single indicator; second, a trend analysis of the evolution of procurement “performance” in the single market and/or in a given country.
The SMSPP and cross-sectional analysis: not fit for purpose
This criticism is largely implicit in the previous discussion, as the creation of indicators that are not reflective of ‘country-specific factors such as what is actually being bought, the structure of the economies concerned, and the relationships between different tendering options’ by itself prevents meaningful comparisons across the single market. Moreover, a closer look at the SMSPP methodology reveals that there are further issues that make such cross-sectional analysis difficult. To continue the discussion concerning indicator Cooperative procurement, it is remarkable that the SMSPP methodology indicates that
[In previous versions] the only information on cooperative procurement was a tick box indicating that "The contracting authority is purchasing on behalf of other contracting authorities". This was intended to mean procurement in one of two cases: "The contract is awarded by a central purchasing body" and "The contract involves joint procurement". This has been made explicit in the [current methodology], where these two options are listed instead of the option on joint procurement. However, as always, there are exceptions to how uniformly this definition has been accepted across the EU. Anecdotally, in Belgium, this field has been interpreted as meaning that the management of the procurement procedure has been outsource[d] (e.g. to a legal company) -which explains the high values of this indicator for Belgium.
In simple terms, what this means is that the data point for Belgium (and any other country?) should have been excluded from analysis. In contrast, the SMSPP presents Belgium as achieving a good performance under this indicator—which, in turn, skews the overall performance of the country (which is, by the way, one of the few achieving positive overall performance… perhaps due to these data issues?).
This should give us some pause before we decide to give any meaning to cross-country comparisons at all. Additionally, as discussed below, we cannot (simply) rely on year-on-year comparisons of the overall performance of any given country.
The SMSPP and time series analysis: not fit for purpose
Below is a comparison of the ‘overall performance’ maps published in the last five iterations of the SMSPP.
Source: own elaboration, based on the European Commission’s Single Market Scoreboard for Public Procurement for the years 2014-2018 (please note that this refers to publication years, whereas the data on which each of the reports is based corresponds to the previous year).
One would be tempted to read these maps as representing a time series and thus as allowing for trend analysis. However, that is not the case, for various reasons. First, the overall performance indicator has been constructed on the basis of different (sub)indicators in different iterations of the SMSPP:
the 2014 iteration was based on three indicators: bidder participation; accessibility and efficiency.
the 2015 SMSPP included six indicators: single bidder; no calls for bids; publication rate; cooperative procurement; award criteria and decision speed.
the 2016 SMSPP also included six indicators. However, compared to 2015, the 2016 SMSPP omitted ‘publication rate’ and instead added an indicator on ‘reporting problems’.
the 2017 SMSPP expanded to 9 indicators. Compared to 2016, the 2017 SMSPP reintroduced ‘publication rate’ and replaced ‘reporting problems’ with indicators on ‘missing values’, ‘missing calls for bids’ and ‘missing registration numbers’.
the 2018 SMSPP, as mentioned above, is based on 12 indicators. Compared to 2017, the 2018 SMSPP has added indicators on ‘SME contractors’, ‘SME bids’ and ‘procedures divided into lots’. It has also deleted the indicator ‘missing values’ and disaggregated the ‘missing registration numbers’ into ‘missing seller registration numbers’ and ‘missing buyer registration numbers’.
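The changes listed above can be checked mechanically. The following is only a sketch that encodes each iteration's indicators as a set, using the names given in the list, and derives each year from the previous one by applying the additions and deletions described. The data structure and helper names are my own.

```python
# Indicator sets per publication year, built from the changes listed above.
I2014 = frozenset({"bidder participation", "accessibility", "efficiency"})
I2015 = frozenset({
    "single bidder", "no calls for bids", "publication rate",
    "cooperative procurement", "award criteria", "decision speed",
})
I2016 = (I2015 - {"publication rate"}) | {"reporting problems"}
I2017 = (I2016 - {"reporting problems"}) | {
    "publication rate", "missing values",
    "missing calls for bids", "missing registration numbers",
}
I2018 = (I2017 - {"missing values", "missing registration numbers"}) | {
    "SME contractors", "SME bids", "procedures divided into lots",
    "missing seller registration numbers",
    "missing buyer registration numbers",
}

ITERATIONS = {2014: I2014, 2015: I2015, 2016: I2016, 2017: I2017, 2018: I2018}


def changes(previous: frozenset, current: frozenset) -> tuple:
    """Return (dropped, added) indicators between two iterations."""
    return sorted(previous - current), sorted(current - previous)


years = sorted(ITERATIONS)
for earlier, later in zip(years, years[1:]):
    dropped, added = changes(ITERATIONS[earlier], ITERATIONS[later])
    shared = len(ITERATIONS[earlier] & ITERATIONS[later])
    print(f"{earlier} -> {later}: {shared} shared, "
          f"dropped {dropped}, added {added}")
```

The resulting sets have 3, 6, 6, 9 and 12 indicators respectively, which matches the counts given in the list, and every consecutive pair differs in at least one indicator.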
It is plain that no two consecutive iterations of the SMSPP are based on comparable indicators. Moreover, the way that the overall performance is determined has also changed. The SMSPP for 2014 to 2017 established the overall performance as a ‘deviation from the average’ of sorts, whereby countries were given ‘green’ for overall marks above 90% of the average mark, ‘yellow’ for overall marks between 80 and 90% of the average mark, and ‘red’ for marks below 80% of the average mark. In the 2018 SMSPP, by contrast, ‘green’ indicates a score above 3, ‘yellow’ indicates a score below 3 and above -3, and ‘red’ indicates a score below -3. In other words, the colour coding for the maps has changed from a measure of relative performance to a measure of absolute performance, which, in fairness, could be more meaningful.
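The difference between the two colour-coding rules can be sketched as follows. The marks and averages are invented, and the treatment of values falling exactly on a boundary is an assumption on my part, as is the premise that the average mark is positive.

```python
def colour_relative(mark: float, average: float) -> str:
    """2014 to 2017 rule: colour depends on the mark relative to the average.

    Assumes a positive average; boundary handling is an assumption.
    """
    ratio = mark / average
    if ratio > 0.9:
        return "green"
    if ratio >= 0.8:
        return "yellow"
    return "red"


def colour_absolute(score: float) -> str:
    """2018 rule: colour depends only on the country's own score."""
    if score > 3:
        return "green"
    if score < -3:
        return "red"
    return "yellow"


# Invented example: one country keeps the same mark in three years,
# while the average across all countries rises.
same_mark = 70.0
for average in (75.0, 80.0, 90.0):
    print(f"average {average}: {colour_relative(same_mark, average)}")

# Under the absolute rule, other countries' results are irrelevant.
for score in (4, 0, -4):
    print(f"score {score}: {colour_absolute(score)}")
```

Under the relative rule, a country whose own mark does not move can still change colour because other countries improve. Under the absolute rule it cannot, so maps drawn under the two rules are not measuring the same thing.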
As a result of these (and, potentially, other) issues, the SMSPP is clearly unable to support trend analysis, either at single market or country level. However, despite the disclaimers in the published documents, there remains a risk that the maps will be read as a time series (to the extent that anyone really engages with the SMSPP).
The example of the SMSPP does not augur very well for the adoption of data analytics-based policy-making. This is a case where, despite acknowledging shortcomings in the methodology and the data, the Commission has pressed on, seemingly on the premise that ‘some data (analysis) is better than none’. In my view, this is the wrong approach. To put it plainly, the SMSPP is rather useless, yet it may create the impression that procurement data is being used to design policy and support its implementation. It would be better for the Commission to stop publishing the SMSPP until the underlying data issues are corrected and the methodology is streamlined. Otherwise, the Commission is simply creating noise around data-based analysis of procurement policy, and this can only erode its reputation as a policy-making body and the guardian of the single market.