Unequal educational outcomes are a key concern for the Government – relating to wider issues with the UK economy, social mobility and social justice. At the UBDC we are examining the drivers of place-based educational inequalities in order to inform educational and urban policy. In this blog, UBDC Research Associate Dr Phil Mason explains how linked big data can provide clarity on the surprisingly complex causes of these disadvantages.
It is often tempting to believe that, by and large, we understand the causes of many of society’s problems and that it is a short step from understanding to developing effective policies to change things for the better.
For instance, it comes as no surprise to find that if you come from a deprived neighbourhood - all other things being equal - your chances of educational success are lower than those of someone from a more flourishing area. This is just common sense, after all. Or is it?
Nevertheless, such broad-brush research findings mask more complex, nuanced relationships arising from subtler, more specific causes. Until recently, gaps in the existence and availability of data have prevented us from examining these relationships in the detail we would like.
In the age of big data, we can now look at things in deeper and wider ways: deeper, by examining information at the finest of scales —the social-scientific equivalent of physics’ fundamental particles of matter— rather than at some level of aggregation; wider, by investigating how things change over time. Added to that, the sheer quantity of data means we can analyse many more putatively influential factors simultaneously and still be able to draw statistically reliable conclusions with confidence.
In our case, as part of the UBDC’s Educational Disadvantage and Place Project, we are using ScotXed’s secondary school data to examine in the greatest possible depth the associations between place —the pupil’s residential neighbourhood and their school— and educational outcomes, by subject, of all S4, S5 and S6 pupils studying in the Greater Glasgow local authority areas (Glasgow City, East and West Dunbartonshire, North and South Lanarkshire, Renfrewshire, East Renfrewshire, and Inverclyde). We have augmented this dataset by linking area-based measures of deprivation (SIMD vigintiles) and urban-rural location, and by calculating the distance as the crow flies between pupils’ homes and their schools. The breadth of the research arises from being able to analyse these characteristics for each school year from 2006-7 to 2014-5, so we can look at period trends and cohort effects.
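As an illustration of the "as the crow flies" measure mentioned above, here is a minimal sketch of a straight-line distance between two latitude/longitude points using the haversine formula. This is an assumption about how such a distance could be computed, not a description of the project's actual method (an analysis based on projected grid coordinates might simply use Euclidean distance). The function name and the coordinates in the usage note are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def crow_flies_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For example, crow_flies_km(home_lat, home_lon, school_lat, school_lon) would give the home-to-school distance for one pupil, where the four arguments are whatever coordinates the linked dataset provides.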
The associations of pupils’ personal characteristics, their backgrounds (tagged locational data, area deprivation, urban versus rural location, distance travelled to school, etc.) and school characteristics (subject availability and choice, staffing levels etc.) with educational outcomes and subsequent educational and employment destinations have been investigated before, to varying extents. However, we have the opportunity to examine them simultaneously and over time, using powerful multivariate, multilevel statistical modelling methods.
Even these complex analyses of educational outcomes cover only a part of students’ overall educational careers and their entry (or not) into employment. Adopting a similar approach, we will extend this work using individual student-level datasets from SFC and HESA to look at the links between place and inequalities in Further and Higher Education outcomes.
We also hope to link the educational data with other datasets related to the availability of public transport, as it has been many years since the influence of available transport on participation in education in Scotland has been studied in the depth it deserves.
In this way, we hope to gain a clearer, combined perspective on the drivers of educational disadvantage throughout Secondary, Further and Higher Education and entry into employment.
By looking deeper and wider, we are sure to be surprised by many of our results: not everything turns out to be what common sense might lead us to expect.
Every year, the Urban Big Data Centre eagerly grasps the opportunity to take part in the ESRC Festival of Social Science. We're always keen to get all kinds of people - such as data enthusiasts, local activists, urbanists and others - involved in the work we do here, to keep our research on big data for urban improvement relevant.
This year we ran an Active Travel Data Challenge, where we asked folks with skills and an interest to use the Strava Metro app data we provide to make tools to encourage more cycling. And we hosted an Active Travel Data Demo Day in November 2017 to show off the challenge finalists' entries, alongside a range of other interesting active travel data developments. We also hoped to get some useful insights from participants that we could pass on to cycling groups and urban planners to enhance their future developments. This blog is about the demo day, and the ideas we gathered are further down the page. You can also see Tweets about the day from a range of participants by searching the hashtag #ubdcdc.
UBDC Active Travel Data Challenge
The UBDC offers a range of data drawn from the Strava app, widely used by cyclists and other athletes around the world. These data collections are known as Strava Metro, and we have an arrangement with Strava to make this data freely available for use in research of all kinds within the UK. For more information on how you can get hold of it, and what work has already been done with it, visit our Transport Data Collection page.
For the ESRC Festival of Social Science this year, we decided to run a data challenge, where we asked people to use Strava Metro data to develop apps, tools, linked datasets or anything else they could think of to promote cycling. We said that tools developed could be of direct use to cyclists and communities, or they could help from the perspective of enabling transport planners and the like to make cycling easier for citizens.
Active Travel Data Demo Day
We had two excellent finalists out of this Challenge (see below for more), and we invited them to present their work for feedback, discussion, and, potentially, offers of further development of their work from the audience. This became the Active Travel Data Demo Day, held at Glasgow's Tontine Building in the Trongate, on the 9th November 2017.
Strava is a company with a strong commitment to using their app data for the public good, and they encourage the user communities that flourish around their app to appreciate how sharing their ride (or run or walk) data can help planners the world over make sustainable and active travel better. Of course, it's also a business decision to keep the profile of their app at the forefront, but they do seem to have a genuine sense of corporate social responsibility.
Because of this ethos, we love working with Strava. We were fortunate to have Haynes Bunn from Strava's offices in the US visit for the Demo Day, despite the fact that she was walking (in a very fit manner) with a stookie (Scots for "cast") on her lower leg! While the remainder of the day was about local uses of Strava and other transport data, she treated us to some inspiring stories of other Strava use around the world.
Data Challenge Finalists
Our Data Challenge finalists were awarded their certificates by representatives of the judging panel: Ken MacDonald of BBC Scotland, and Kirsty Grainger of ESRC. The audience received these ideas enthusiastically, and a number of positive suggestions for onward development into real live apps were given. The finalists were Bill Oates and Iain Paton.
Bill Oates: Strive City
Bill Oates' tool uses Strava Metro data along with our sister Centre the CDRC's area classification scheme. Its aim is to encourage individuals and local areas to improve their cycling activity, mapped against similar activity in other places. It could also assist with strategic planning and evaluation of new infrastructure and policy interventions.
Feedback from participants included:
They liked the concept of encouraging a competitive community spirit to get people cycling and engaging with their neighbours more.
There was a suggestion to think about incentivising schools to take part, offering awards, e.g. free bike tune-ups.
Bill would like to engage with Strava to re-do the output areas and discuss how to update the tool more frequently.
Iain Paton: Visualisation of Strava cycle usage data against collision data and existing infrastructure in Glasgow
Iain Paton's tool uses Strava Metro data along with STATS19 road incident data, cycle infrastructure data from OSM, and Council Ward boundary data for Glasgow, to provide cyclists with a way to immediately report incidents and barriers to cycling to their local Councillors.
Participants liked the idea of being able to give 1-2-1 feedback directly to their City Councillors. They also suggested:
Other representatives to be an option, e.g. (in Scotland) MSPs, in some cases relevant Council departments (e.g. cleansing).
A 1-click system would encourage use – autogenerating an email to the selected person or people.
Generating templates and alerts would also be good.
Using the app to take a photo and send as part of the 1 or 2 click process.
Being able to see outcomes or what others are saying.
Iain would like to make it more scalable and available on a range of platforms.
Big Data in the Service of Cycling: Lightning Presentations
We were also treated to a number of lightning talks from a range of academic, activist and third sector presenters. With only 3-4 minutes each they whizzed through the highlights of their work in a dizzying array of exciting cutting-edge ideas:
Mark showed us some academic research currently in progress on using Strava Data to evaluate the impact of new cycling infrastructure in Glasgow on cycling behaviour in the wider network. Once this work is completed, we'll ask Mark to do a new blog post about it, so watch this space!
Yeran presented his research into using Strava app and social media data to inform transport planning. He also recently wrote a fuller blog post on this work.
Dr Andrew Kirkland (Stirling University)
Andrew had a really different presentation for us on his research into overcoming psychological barriers to exercise. His work is grounded in supporting elite athletes, but he talked about how he is now working on applying insights from that work to help people cycle.
Francesca gave us two presentations: one on school-based active travel interventions, and one on gender and active travel. The latter generated a lot of discussion in the break-out groups and final summary session (see below).
Gathering People's Thoughts
After such an intense day with so many ideas shared, we took some time out to break into groups and discuss our thoughts. People's feedback on the Challenge Finalists' apps is recorded above, but we also gathered some great ideas for further academic research and for planners and policymakers.
Ideas sparked by the gender and cycling presentation by Francesca Hogg:
More research is needed in this area.
It needs to be broken down into different sub-categories of women – age, employment, whether they have to look after small children, etc.
There was a suggestion to take note of how the Dutch cycling infrastructure came about: through the activism of women concerned about the safety of children. How to learn from and build on that.
It should be extended to include other marginalised groups, e.g. for ethnicity, sexual orientation, etc.
Identifying safest routes could be one way of encouraging more cycling.
The ethics of gathering info via apps and tracking people's movements need to be clarified from a safety perspective (both perceived and actual risks).
Big data research must be supported by qualitative research to identify specific concerns of different types of cyclists.
More linking with environmental datasets was suggested as a productive area.
Identifying ideal routes in and out of cities could also involve the use of "beauty indexes" and sensory experience research.
Demand: predicting future demand and constraints with solid research would be useful.
Identification of "have to cycle" routes – where there is no choice but to go that way, how to know what those routes are and how to ensure they are suitable / improved.
The economic impact of cycling could be further researched.
It was noted that Strava gives a good macro-level view, but it would be useful to understand what generates this view in a more detailed way – e.g. microsimulation models.
More research needed on how Strava data compares to other cycling data.
Finally, for the real transport research geeks: the ontological questions – what constitutes a "trip" or a "journey" or a "commute"? Shared definitions of these will greatly enhance research and planning built on data.
Of course, all of these research ideas can feed into planning and policy ideas:
There should be some focus on areas where cycling doesn't happen. For instance, this could help with placement of cycle stations.
Roundabouts are clearly a high-risk area for cyclists – this needs to feed into roundabout planning and design.
It might be helpful to work on bringing together the car and cycling communities, which are currently somewhat at odds with each other.
More safe storage for bikes is needed, as bicycle theft is a big issue for cyclists.
Community schemes for cycling together could be introduced to support women and others who may feel unsafe cycling alone.
The "have to cycle" routes noted above under Academic Research need to be investigated and made safer. Some cycle routes in Glasgow are "bad"!
For shared-use walk/cycle routes, gathering and analysing data on pedestrian use and speed of cyclists there could address or allay concerns of pedestrians.
There is a need to track and analyse what happens after an intervention.
Route planning and mapping should be tailored to specific needs and types of users, as well as specific types of journey (e.g. commutes – fast, direct, safe from rush-hour traffic; leisure rides in cities – beautiful, safe for families, clean air, etc.).
As promised, we'll be sending this collation of ideas to all the groups involved in the day as well as other planners, policymakers and academics in our networks: please feel free to do the same and report back any news or updates!
The Urban Big Data Centre is seeking an excellent individual to lead in business development and income generation for the Centre. The post holder will also be responsible for managing the Centre’s finances.
The ideal candidate will bring a great degree of experience in research funding, commercialisation of academic outputs, and financial management, and be prepared to represent UBDC in external partnerships. The position will require liaising with staff within University of Glasgow and partner universities for financial purposes including procurement, and maintaining the appropriate networks for the successful conduct of these activities as well as those relating to new bid preparation and other income generation. The position requires experience of business development and of industry and stakeholder relationship management, and the ability to work with a range of stakeholders.
The Citizens at the City’s Heart project (Catch! for short) was an exciting idea born out of a desire to generate better data for use in transport planning, and ultimately to make moving around cities easier.
It has now developed into an innovative project – funded by Innovate UK, led by Travelai and involving the Urban Big Data Centre, the Consumer Data Research Centre, Transport Systems Catapult, transportAPI, elgin, Coventry City Council, Ipswich Borough Council, Oxfordshire County Council, Leeds City Council, Newcastle City Council and the behaviouralist.
In this blog, Dr David McArthur explains the contribution that UBDC researchers are making to this collaborative project and how you can get involved.
New opportunities for tracking our travels
Traditionally, we have relied on surveying people to find out where they are travelling to, how they get there and what they do when they get to their destination. Gathering data in this way is expensive and relies on those surveyed accurately recalling all the details of their journey. This makes it harder for planners to come up with smart solutions.
Big data is offering new opportunities in tracking people’s movements. Mode-specific datasets such as cycling apps like Strava give us great data about what cyclists do, but don’t tell us much about what they do when they get off their bikes. Data gathered from mobile phone signals give us information about where people are travelling from and to, but don’t tell us much about how they get there or what routes they take. Data from sensors, such as traffic counters, tell us how many people pass a particular point at a particular time, but not much else.
Using an app to crowdsource better data
The Catch! project aims to improve the data gathering process by allowing people to collect their own data and make it available to their local authorities, in order to improve transport and planning in their area. The crowdsourced data is collected via an app that, with user consent, tracks the location of the user and what mode of transport they are using – all while providing a very useful journey planner! The app uses GPS signals, as well as some clever machine learning algorithms that use the phone’s sensors to guess what mode of transport is being used. These algorithms make use of the fact that our mobile phones move differently, and travel at different speeds, when we walk compared to when we’re in a car or on a train.
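To give a feel for the speed part of this idea, here is a deliberately simple sketch that guesses a travel mode from a handful of speed readings. It is a fixed-threshold rule, not machine learning, and it is not the Catch! algorithm. The function name, the mode labels and every threshold are assumptions chosen purely for illustration.

```python
def guess_mode(speeds_m_per_s):
    """Guess a travel mode from speed samples in metres per second.

    Illustrative thresholds only; a real classifier would also use
    accelerometer patterns and would be trained on labelled trips.
    """
    if not speeds_m_per_s:
        return "unknown"
    mean_speed = sum(speeds_m_per_s) / len(speeds_m_per_s)
    if mean_speed < 0.5:
        return "stationary"
    if mean_speed < 2.5:   # roughly walking pace
        return "walk"
    if mean_speed < 8.0:   # roughly cycling pace
        return "cycle"
    return "motorised"     # car, bus or train
```

A rule like this cannot separate a car in slow traffic from a bicycle, which is one reason an approach that also draws on the phone's other sensors is likely to do better.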
Making sense of the data and ensuring privacy is protected
Having millions of GPS points isn’t particularly useful for planners. What they need is more aggregate information about who is travelling, their destination, the mode of transport used, and the purpose of their trip. In addition, the raw GPS data is highly sensitive, as it reveals information about the people volunteering their data.
This is where the UBDC’s expertise comes in. Our role in the project has been focussed on making sense of the GPS points and ensuring people’s identities are protected. We have experience in this field gained from our work with the GPS data collected as part of the integrated Multimedia City Data (iMCD) project. For Catch!, this involved:
Using a technique called ‘map matching’ to work out what transport infrastructure (e.g. roads) people were using
Working out what locations were visited using stop detection and semantic annotation
Anonymising the data using a variety of techniques including grid masking and blurring.
The resulting data is not only in a useful format for planners but also in a format that protects the identity of the app user.
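Two of the steps listed above, stop detection and grid masking, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the UBDC pipeline. Points are assumed to be (time in seconds, x, y) tuples in projected metres, sorted by time. The function names, the 50 m radius, the 5-minute dwell time and the 500 m cell size are all hypothetical. Map matching, semantic annotation and blurring are not covered.

```python
import math


def detect_stops(points, radius_m=50.0, min_duration_s=300.0):
    """Find dwell periods in a time-sorted list of (t, x, y) points.

    A stop is a run of consecutive points that all lie within radius_m of
    the first point in the run and that spans at least min_duration_s.
    Returns a list of (t_start, t_end, mean_x, mean_y).
    """
    stops = []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        j = i + 1
        # Extend the run while points stay close to the anchor point
        while j < n and math.hypot(points[j][1] - x0, points[j][2] - y0) <= radius_m:
            j += 1
        run = points[i:j]
        if run[-1][0] - t0 >= min_duration_s:
            mean_x = sum(p[1] for p in run) / len(run)
            mean_y = sum(p[2] for p in run) / len(run)
            stops.append((t0, run[-1][0], mean_x, mean_y))
            i = j          # continue after the stop
        else:
            i += 1         # not a stop; try the next point as anchor
    return stops


def grid_mask(x, y, cell_m=500.0):
    """Replace a location with the centre of the grid cell that contains it."""
    snap = lambda v: math.floor(v / cell_m) * cell_m + cell_m / 2
    return (snap(x), snap(y))
```

Masking each detected stop with grid_mask means that a released location reveals only which cell was visited, not the exact address. The cell size is a trade-off between privacy and how useful the data remains for planners.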
How you can get involved
The beta version of the app can be downloaded now from the App Store and Google Play and allows you to plan routes anywhere in the UK, see nearby travel options and save your favourite transport hubs and locations. Please download the app if you are interested in helping us to build a detailed picture of how your community travels that can be used by local authorities and transport providers to improve systems and services for you.
With the deadline for the project extended and work continuing, look out for more project updates on this blog!
The Urban Big Data Centre is seeking an excellent researcher with expert knowledge of transport planning to fill its current Research Associate in Transport Modelling and Informatics vacancy.
The work of the transport research group is central to the UBDC portfolio. This position will help develop the centre’s research agenda on travel behaviour and transport operations using novel sources of data, and the links between these research areas to broader urban themes.
Specifically, the job requires expert knowledge in the area of transport planning and operations problems, methods, and the policy agenda surrounding these issues. The position will provide the opportunity to learn about new forms of data and the methods needed for information retrieval and analytics. The post-holder will also be required to work with an interdisciplinary group of transport researchers in urban studies and engineering.
The successful candidate will be expected to produce high-impact research publications demonstrating the use of such novel data, to represent UBDC in scholarly as well as public engagement events, and to contribute to the building of innovative data products supporting UBDC’s data infrastructure and to the centre’s training and capacity-building programme.
The deadline for applications is 10 December 2017.
How many landlords are there in Scotland? You'd be forgiven for assuming this is a simple enquiry with a simple response. However, despite its growth, the private rented sector remains difficult to study. In this blog, UBDC Associate Director Nick Bailey explores the diverse datasets that could be used to attempt to answer this question and explains the public value of getting a credible answer.
The private rented sector continues to grow apace in Scotland – reaching 15 per cent of all housing in the latest Scottish Household Survey estimates, a trebling in size since 1999. It remains a sector dominated by a large number of landlords who own a very small number of properties – often just one or two.
The large number of landlords makes the task of raising standards in the sector that much harder. With many involved on only a part-time basis, it is difficult to ensure that all of them understand their legal obligations. Standards are not necessarily worse with small landlords but they are likely to be more variable.
It can also add to the churning of properties and hence the insecurity of the tenure as small landlords are more likely to be short-term investors. That’s why governments in Scotland and the UK have been keen to encourage more institutional investment.
So how many people in Scotland are landlords?
Estimate from surveys
One approach is to piece together a picture from surveys and other research. The details are shown in the table below but, in summary, these suggest there were around 262,000 private rented dwellings owned by individual investors in 2015/16. Allowing for the fact that some people own more than one property while some properties have more than one owner, the number of individual landlords would be about 223,000. That’s about 1-in-20 of the adult population.
Ask the tax office
A second approach is to ask the tax office, HMRC, how many people declare an income from renting private dwellings on their tax returns. This is unlikely to be a complete count of the number of landlords. HMRC won’t know about some because they have no taxable income to declare after allowing for rent lost during void periods and legitimate expenses. And they won’t know about others who have taxable income but fail to declare it (tax evasion). But it is at least a minimum estimate.
In response to a recent Parliamentary Question, HMRC figures showed some 143,000 individuals living in Scotland declaring income from any kind of property in 2015/16(1). Even assuming that all of these own private rented accommodation (rather than, say, renting out shops or business premises), this is about one-third less than the estimate from surveys and suggests something like 80,000 people may be renting private property but not declaring any income from it for tax purposes.
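The comparison in the paragraph above is simple arithmetic on the two figures quoted in the text. A short sketch makes the working explicit; the variable names are ours, and the numbers are the rounded estimates given above.

```python
survey_estimate = 223_000   # individual landlords, survey-based estimate
hmrc_declared = 143_000     # individuals declaring property income, 2015/16

gap = survey_estimate - hmrc_declared
shortfall_share = gap / survey_estimate

print(gap)                       # 80000
print(f"{shortfall_share:.0%}")  # 36%
```

So the HMRC count is a little over one-third below the survey-based estimate. Because this treats all 143,000 as residential landlords, the true gap could be larger.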
As I noted, there are quite legitimate reasons why some - or indeed many - of these people would not be on HMRC’s list. Nevertheless, there are good grounds here for HMRC to make more proactive efforts to identify the non-payers.
Look at the Landlord Register
A third approach is to look at the number of registered landlords. The estimate from this source might understate the true number to some extent because some landlords fail to register. On the other hand, it might overstate it because registrations continue for three years, regardless of whether a property continues to be let. Even so, it is probably our most reliable source at present.
Unfortunately, neither Scottish Government nor local authorities choose to make this information available. As I have previously argued, it is high time we started putting the data held in the registration system to work for the public benefit. One first step might be to follow the example of Newham Council in London and share the data with HMRC to help identify tax underpayment(2). Since any additional tax revenues generated would flow to Scottish Government, they would seem to have a clear interest in pursuing this.
Table 1: Summary of survey-based estimate of number of landlords
Measure | Source
No. of properties rented from private landlords in Scotland, 2015 | Scottish Household Survey 2016 report: Table 3.1
Percent owned by individual/couple (rather than company) | Crook et al (2009: p29)
Number owned by individual/couple |
Average portfolio size for individual/couple | Derived from Crook et al (2009)
Number of portfolios |
Split between individuals and couples:
Percent owned by individual | Crook et al (2009: p29)
Percent owned by couple | (Ignoring cases with 3+ owners)
Estimated number of individual owners |
1) Landlords: Taxation: Written question - 105116 on the UK Parliament website. Curiously, HMRC gave a response to a Freedom of Information request just a few weeks earlier which suggested there were just 73,000 people in Scotland paying tax on property income. While they have said they believe the later response to the PQ to be more accurate, they have not yet explained why there was such a massive discrepancy.
One of the most interesting areas of research at UBDC - particularly within our iMCD programme and subsequent leading-edge research based on iMCD data - is the use of wearable devices by research participants to gather detailed data. A small subset of folk who had taken part in the Household Survey element of the iMCD project also spent a week wearing GPS tracking devices and lifelogging cameras everywhere they went. We soon found that others working on a variety of research topics wanted to not only use the knowledge we had gained from doing this but also to borrow our wearable devices to collect data. Lavinia Hirsu, a researcher at University of Glasgow’s School of Education, was one of the first. Here’s her story.
Two years ago, I attended an introductory seminar at UBDC on multiple sources of dynamic data collection and innovative methods and methodologies, all developed with the help of cutting-edge technology. At that moment, I was at a turning point in my research and I was looking for innovative approaches to study students’ digital practices in the university context. I wanted to adopt a fresh perspective to understand how students engage in the university with their devices and academic work. The generations of students that we welcome in our courses are users of digital technologies and have incorporated the digital and the analogue in seamless ways in everything that they do. To capture these practices, I wanted to be able to track students’ activities on campus, at home, in the streets, and everywhere else they carry their academic work.
The iMCD Project sparked my interest in the huge potential of using lifelogging cameras in my research. The support from the UBDC perfectly aligned with my research goals and I was able to conduct a pilot study where I tracked the activities of three international students over the course of a week. The participants agreed to wear the cameras and record their activities, and the resulting data included a total of 50,000+ snapshots. The UBDC team provided valuable technical support and Dr Catherine Lido, who worked closely with research staff in the School of Education, helped me with extensive guidance on the ethical dimensions of collecting and handling this type of data. To protect the identity of my participants and their data, I developed a protocol that gave the participants the possibility to delete unwanted private images. I also made sure that the photos that were to be used in the project followed the most recent research guidelines for identity and privacy protection.
The cameras were spot on! The images captured reveal a fascinating range of practices and habits that sit at the border of the digital and non-digital. The findings from this project show that students use their devices purposefully, perfectly aware of their potential to create distractions and interruptions of their learning flow. When it comes to academic work, students draw upon different sources of knowledge which come from the traditional books and journal articles, as well as their Facebook feed and other social media sites. What’s more interesting is that the archive of images suggests that students operate in post-digital environments. Their digital and non-digital activities are not separate but integrated at all times. Students use all possible devices (from pen and paper to computer and mobile phones) to create knowledge. It is almost impossible to simply describe what students learn from Moodle OR their textbooks OR social media. Knowledge is formed from all of these brought together.
I have had the opportunity to share the preliminary findings of my research at an international conference (the Conference on College Composition and Communication, Portland, USA) and in the UK at the 9th Conference of the European Association for Teaching Academic Writing (Royal Holloway, University of London). Both presentations were very well received and attendees were very interested in learning about the methodological aspects of using the lifelogging cameras. I am currently preparing a journal article and a conference presentation on the same project for the Scottish Educational Research Association Annual Conference. While this was a pilot project with a small number of participants, given the positive response that I received, I plan to seek funding for a larger project. This will allow me to work with the UBDC resources and expertise to investigate students’ practices in post-digital environments at a larger scale and for a more extended period of time.
Clare Tilden (Data Sharing and Supplier Manager at ONS) said: “We have enjoyed working closely with the Urban Big Data Centre, who so helpfully supplied us with the Zoopla data they have been collecting for research use. This collaboration is an excellent example of publicly funded organisations sharing resources and expertise for the wider benefit of society. We hope these initial experiments using big data from UBDC will expand into other areas soon”.
How much does it cost to rent a private property in your area?
I’d really like to be able to answer that question, and so would housing policy makers in local authorities up and down the country. Given that the size of the private rented sector has near enough doubled over the past decade, it might come as a surprise to you that we can’t actually measure average rent prices for small areas using official data. At the risk of embarking on months of migraine-inducing data wrangling, I asked myself a dangerous question. How hard can it be?
Long story short: Quite hard.
There are already some official stats that can help point us in the right direction. First, there’s the snappily titled Index of Private Housing Rental Prices, which tells us about inflation of rent prices at the national and regional level. Then there’s the Valuation Office Agency’s private rental market statistics, which provide an indication of average rent prices at the local authority level based on a sample of rental data. Trouble is, neither of these was originally set up to provide small area statistics (because of the model-based method and the sampling respectively) and that’s what policy makers are now really after in trying to understand their local areas.
There is some hope though. A few years ago we started using admin data on property transactions from the Land Registry to produce House Price Statistics for Small Areas (the first rule of housing stats is that they must take longer to pronounce than produce). The address-level data we use for this allows us to produce small area statistics on the price paid for residential properties. Since then, a whole range of address-level admin data has become available, including some property website data which might just help us crack the small area rent price problem.
Zoopla. It’s pretty much the Amazon website of the UK property market these days, and as such holds data which offers a rich source of information about properties for sale and for rent. We’ve used these data to work out the average advertised monthly rent price at the small area level for the whole of Great Britain between 2010 and 2016. Let’s call this ‘Advertised Private Rent Price Statistics for Small Areas’ and maintain the theme of ridiculously long titles shall we? We’ve also looked at the number of rental property listings over this period, to get an indicator of rental market activity.
Fig 1: Understanding Zoopla Data
Our initial analysis of the first outputs suggests that the data do provide us with a reasonably comparable set of rent prices to the current official statistics for the larger geographies. The fact that we have some first outputs to analyse also suggests that we’ve found a broadly sensible way to handle and geo-reference the rather large amount of admin data.
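To make the aggregation concrete, here is a minimal sketch of how an average advertised monthly rent and a listing count could be worked out per small area and year. It is not the production method: the record layout, the area codes and the rent values are all invented for illustration, and the real data need cleaning and geo-referencing first.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical listing records: (small_area_code, year, advertised_monthly_rent_gbp).
# Area codes and values are invented for illustration only.
listings = [
    ("A001", 2015, 650.0),
    ("A001", 2015, 750.0),
    ("A001", 2016, 800.0),
    ("A002", 2016, 1200.0),
]


def summarise_rents(records):
    """Return {(area, year): (mean_rent, listing_count)}."""
    grouped = defaultdict(list)
    for area, year, rent in records:
        grouped[(area, year)].append(rent)
    return {key: (mean(rents), len(rents)) for key, rents in grouped.items()}


summary = summarise_rents(listings)
for (area, year), (avg_rent, n) in sorted(summary.items()):
    print(f"{area} {year}: mean £{avg_rent:.0f} from {n} listing(s)")
```

Note that an area with no listings in a given year simply does not appear in the output, which is why the listing count is worth keeping alongside the average.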
Like all good statistics, though, there are of course many, many caveats and limitations which we need to understand and hopefully overcome. Here are just a few of those, starting with the most challenging:
The data relate to properties advertised for rent and so don’t cover all rented properties. The difficulty is in describing the extent to which advertised rent prices are representative of all rent prices.
Not all properties advertised for rent are included in the data. Some letting agents only use their own website, or other property websites, to advertise properties for rent.
Not all rented properties are advertised at all, anywhere. A substantial part of the private rented sector is casual, and these properties may never appear in property website data or any admin data.
Some areas appear to have no properties advertised for rent in the entire time-series of data and we don’t really know why.
We now need to work with the users of rent price statistics to find out if this new output meets their requirements, and whether we can use other data sources to overcome some of the limitations. We’d also like to produce information about the rent prices of different property types and sizes. Ultimately we aim to have a suite of rent price statistics that are comparable across geographies, which will help policy makers genuinely meet the demand for housing in their areas.
In the statistical utopia of the future, I suspect all this will be possible automatically and programmatically from a constant stream of linked and open big data. Until that happens, I still have a job.
Individual-level data on the social care people receive has been collected by local authorities in Scotland for the last seven years. Apart from an annual aggregate social care survey collected and published by the Scottish Government, this data is seldom used. We now have a UBDC PhD project fully under way, with the support of the Scottish Government and Renfrewshire Council, to dig deeper into social care and health data and improve planning and services.
Why Look at Social Care Data: Planning for Scotland’s Health and Social Care
Co-funded by the Economic & Social Research Council (ESRC) and the Scottish Government, this research is exciting for a number of reasons. This will be the first time (other than pilot, proof-of-concept projects) that social care data will be linked to health data on such a large scale in Scotland. It will provide unique insights into the interaction of health and social care services. This is timely given that services across Scotland are being radically redesigned and integrated with the aim of reducing expensive unscheduled health care use. The research has, therefore, the potential to identify important areas for policy development. It is also the first time that an appraisal of social care services by socioeconomic position has been conducted, giving the first insight into whether inequalities in access to services exist. Given that reducing inequalities is one of the main priorities in the Scottish Government's national outcomes, the importance of this analysis is obvious.
What We Currently Know
The Scottish Government have published anonymised social care data from the 2010, 2011, and 2012 social care surveys as open data. To ensure individuals cannot be identified, the data has been banded and some statistical disclosure methods have been applied. However, there are still some interesting results to be derived from the released files.
Subsetting the data to include only over 65s and attaching population estimates from the National Records of Scotland shows there are marked variations in the proportions of older people receiving home care (e.g. help with washing and dressing) across the country as shown in Figure 1.
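As a rough sketch of that calculation, the snippet below subsets client records to the older age bands and divides by a population estimate to give a rate per 1,000 people aged 65 and over. The authority names, age bands and all figures are invented; the released files use their own banding.

```python
# Hypothetical inputs: client records as (local_authority, age_band) and
# population estimates for people aged 65+. All figures are invented.
clients = [
    ("Authority A", "65-74"),
    ("Authority A", "75-84"),
    ("Authority A", "18-64"),  # excluded: under 65
    ("Authority B", "85+"),
]
population_65_plus = {"Authority A": 1000, "Authority B": 250}
OLDER_BANDS = {"65-74", "75-84", "85+"}  # assumed banding


def home_care_rate_per_1000(client_records, population):
    """Home care clients aged 65+ per 1,000 residents aged 65+, by authority."""
    counts = {}
    for authority, age_band in client_records:
        if age_band in OLDER_BANDS:
            counts[authority] = counts.get(authority, 0) + 1
    return {a: 1000 * counts.get(a, 0) / pop for a, pop in population.items()}


rates = home_care_rate_per_1000(clients, population_65_plus)
print(rates)
```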
When we compare the number of weekly hours of home care that clients receive, we again see wide variations between local authority areas (Figure 2). In some areas over 60% of clients receive fewer than 4 hours of home care a week, whilst in others the proportion is a little over 20%.
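The comparison of hours can be sketched in the same spirit: for each authority, count clients in each weekly-hours band and divide by that authority's total. The band labels and records here are made up for illustration and do not reproduce the published figures.

```python
from collections import Counter

# Hypothetical banded records: (local_authority, weekly_hours_band).
records = [
    ("Authority A", "<4"), ("Authority A", "<4"), ("Authority A", "<4"),
    ("Authority A", "4-10"), ("Authority A", "10+"),
    ("Authority B", "<4"), ("Authority B", "4-10"), ("Authority B", "4-10"),
    ("Authority B", "10+"), ("Authority B", "10+"),
]


def band_shares(rows):
    """Return {(authority, band): share of that authority's clients}."""
    totals = Counter(authority for authority, _ in rows)
    by_band = Counter(rows)
    return {(a, b): n / totals[a] for (a, b), n in by_band.items()}


shares = band_shares(records)
for (authority, band), share in sorted(shares.items()):
    print(f"{authority} {band}: {share:.0%}")
```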
What causes these variations? Are there such large differences in social care need across differing local authority areas that account for the difference in provision? Or are there differences in local eligibility criteria? Are councils, under considerable budget pressures, responding in different ways? Or is this a problem with data collection and quality - are the numbers a true reflection of service delivery?
Finally, perhaps the most important question - does the inverse care law, already observed in primary care in Scotland (by Mercer et al and McLean et al), also exist in the social care sector? Is access to social care harder in areas where there is the most need?
The inverse care law: “The availability of good medical care tends to vary inversely with the need for it in the population served.” (Tudor-Hart, 1971)
These questions come at a significant time: health and social care services are integrating following legislation introduced by the Scottish Government in 2016. The legislation acknowledges that multimorbidity, the presence of more than one long-term condition in an individual, is the norm for those over the age of 65 and that the current single-disease framework of medical care is no longer suitable. We believe those with multimorbidity also have significant non-medical needs (such as social care). Integrated care services should aim to improve the lives of these people, but our knowledge of how individuals’ health and social care needs overlap is limited.
These issues are at the heart of my PhD. Do levels of socioeconomic position or levels of multimorbidity within local authorities help explain the wide variations in levels of care shown above, or are there other factors at play? Only linked administrative data can help us answer this question.
Getting Under Way with Renfrewshire Council
September 2017 has seen the transfer of individual-level social care data, spanning ten years, from Renfrewshire Council to the national safe haven as part of my UBDC-based PhD project. Facilitated by UBDC’s Controlled Data Service, this partnership with Renfrewshire Council aims to gain a deeper understanding of the provision of social care, and to assess the suitability of this type of data for both academic research and local planning purposes.
The safe transfer of the social care data has been the result of close working with partners at UBDC and the council. Initially, along with my supervision team, I approached the council about the type of social care data they held. Further discussions then took place to address the feasibility of answering my research questions with the available data. Once agreement had been reached that the project was feasible, a formal approach was made to UBDC to facilitate the project, and the necessary approvals and ethical requirements were completed. A data sharing agreement between the Council and the University of Glasgow was signed, which finally allowed the transfer of data to take place.
This phase also has benefits for the council. Danny McAllion, Data Analytics and Research Manager at the council, said "This research will help move forward our understanding of how social care is currently delivered and assist us in improving the design and delivery of our services. Working with the UBDC has been a valuable experience and a good example of academia and Local Government sharing resources and expertise. We'll look for opportunities to work with them again in the future."
Progress and Plans
Work on the first (Renfrewshire data) phase has started and transfer of data for the main linkage project is expected before the end of 2017. Procuring and linking administrative data can be a long process with many approval and regulatory hoops to jump through to protect privacy and security (the process can take up to 12-18 months). However, as detailed above, the potential benefits of linking administrative data make this a worthwhile process and UBDC staff are really helpful in enabling this process. Watch this space for further news on the progress and impact of this work!
The Urban Big Data Centre partnered with the Office for National Statistics Data Science Campus and The Alan Turing Institute in putting on a Data Dive in July 2017.
We supported this hackathon-style event by providing Strava, Zoopla, and other data from our collections for participants to hack with. We also supplied our resident Geospatial Data Scientist, Rod Walpole, to help with the data and support the participants. The event was for UK-based PhD students, postdoctoral fellows and early career researchers in data science, and we were ably represented by our UBDC Research Associate from the University of Glasgow’s School of Mathematics and Statistics, Dr Francesca Pannullo. So ably, in fact, that Francesca’s team won second prize! More on that below, including her prize-winning entry on building more homes for the UK.
Francesca on the event and the prize-winning entry:
I was invited to The Alan Turing Institute in London to take part in a two-day, policy-focused data dive, co-hosted by the ONS Data Science Campus, to design data science solutions to some of the real-world challenges facing our societies today. This data dive focused on issues that urban environments face, such as widening health inequalities between the rich and poor, decreasing green space, and decreasing locations where new houses can be built. These three issues were each assigned three teams of researchers encompassing a vast array of skills, using a range of data sources available from the Satellite Applications Catapult, the Urban Big Data Centre and ONS.
I worked on the housing issue, which asked: where can we build more houses? My team developed a procedure to locate areas that could be used for building houses. This involved analysing Zoopla, planning permission and green space data with GIS and mapping software, as well as analysing satellite data at different wavelengths to locate specific sites, such as brownfield land (previously developed land that is not currently in use), that could be used to build houses.
This data dive allowed networking between people with numerous skills, and highlighted that combining these skills and working together can help generate concise and novel ideas. My team was awarded second prize for our novel approach to combining numerous different data sets and developing a clear methodology for locating new building areas that government bodies can adopt, while helping shape policy decisions with regard to the housing sector. I have summarised our prize-winning entry below.
There are numerous pressures on the UK housing market, prompting intense discussion of how it can be improved to benefit the population as a whole. One of the main issues is finding available and suitable locations that can be built on and are free of restrictions, such as green space designation, wildlife protection or other land-use constraints. These restrictions make it costly and time-consuming to identify available areas, which in turn delays contractors from actually starting to build.
Our potential solution
Our aim was to develop a methodology or procedure to identify potential areas where houses can be built. Numerous data sources were available as part of the data dive, such as Sentinel satellite data, local planning application data from ONS, green space data from UBDC, and Zoopla data from UBDC covering both private rents and housing sales. These data sets focused on the cities of Glasgow and Manchester, but my team and I decided to focus on Manchester because the satellite data for Glasgow was too cloudy! Typical Scottish weather!
One of the researchers in the team was a physics PhD student and was used to analysing satellite data using Python, so he took on the role of assessing Manchester through different wavelengths in order to locate potential areas, such as brownfield land.
The rest of the team and I worked on using the planning application data as a way of locating potential areas that are already being monitored for numerous land uses. The planning application data only pertained to Manchester city, so this became our new area of focus.
We then made use of a classification system developed for the satellite data, which essentially classifies areas in terms of what they are being used for. This meant we could filter out all areas that were already developed, leaving the areas that could potentially be available for house building.
By combining these classes with the underlying map we were able to pick out potential areas for re-development.
Furthermore, it was beneficial to also filter out areas of green space, since these are areas that cannot be built on. This then gave us an overview of areas that could potentially lead to housing development (image below). With more time to work on this, one could zoom into a potential location and use the Zoopla data to identify local housing sales and rental prices in order to ascertain whether it would be worth building in this particular area. The planning application data could further support an area’s potential for house building if planning applications in the surrounding areas have been successful. Both the Zoopla and the planning application data are useful for showing activity and supply/demand levels in areas, thus possibly aiding contractors’ decisions on whether it is worthwhile to build or not. These could also provide market signals to help determine the quality and validity of the potential location.
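The two filtering steps can be illustrated with a small sketch. It is not the code we wrote at the data dive: the parcel records, the class names and the green space flag are all hypothetical stand-ins for the satellite classification and the green space data.

```python
# Hypothetical parcels: a land-use class from the satellite classification
# and a flag taken from the green space data. Class names are invented.
parcels = [
    {"id": "p1", "land_class": "built_up", "green_space": False},
    {"id": "p2", "land_class": "bare_ground", "green_space": False},
    {"id": "p3", "land_class": "vegetation", "green_space": True},
    {"id": "p4", "land_class": "vegetation", "green_space": False},
]
DEVELOPED_CLASSES = {"built_up"}  # assumed label for already-developed land


def candidate_sites(items):
    """Keep parcels that are neither already developed nor green space."""
    return [
        p["id"]
        for p in items
        if p["land_class"] not in DEVELOPED_CLASSES and not p["green_space"]
    ]


candidates = candidate_sites(parcels)
print(candidates)
```

The remaining parcels are only a first shortlist; the Zoopla and planning application checks described above would come after this step.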
As a team we wanted to think outside the box and brainstorm other ways of locating potential areas to be built on. This led to the popular idea of crowd sourcing, which is a fun way of involving the public in helping shape the UK’s housing market. Participants could be asked to tweet geotagged photos of potential areas or sites (e.g., unused retail land) in the neighbourhood using a specific hashtag (we tried to be funny and came up with our own, #rustyrusty), which could then be used as a training set to validate and locate potential areas and sites.
Overall this was a very positive experience and I absolutely loved my two days in The Alan Turing Institute. I got to work with a host of researchers with different skills and, of course, meet a lot of wonderful people. It was an intense couple of days trying to get everything finished on time, but everyone who was there to help made it a very enjoyable experience and provided sound advice the entire way through. The food at the event was absolutely delicious! And the constant tea, coffee and food was much needed to fuel everyone’s brains working in overdrive. I would like to say a big thank you to The Alan Turing Institute, ONS Data Science Campus and UBDC for organising this event and for providing the data and expertise that made this an excellent experience.