Category Archives: Data Quality

Time for Zero-Tolerance

I heard the blindingly obvious statement at the recent Intellect stream of the BCS HC2013 Conference in Birmingham that “If you give people software that makes their life easier then adoption is not a problem”. Given the clear truth of this statement, I wonder why so many of those “designing” and procuring NHS IT systems manage to deliver systems that make life more difficult for frontline staff, undermining the quality and service they are able to offer patients.

If we are to get the benefits proposed by those promoting a paperless NHS, this has to end, and we have to introduce a regime of Zero-Tolerance for systems that make life more difficult for frontline staff.

How do we do this? I have some suggestions:

  • Transparency – Let’s encourage NHS staff to name and shame systems that make their life more difficult, with reviews and rankings published, maybe with some new categories in the EHI Awards for good design and something akin to the “Golden Bull Awards” for the worst examples. Let’s be honest when the Emperor has no clothes. Too often I hear positive comments about NHS IT systems in front of senior management which evaporate when you talk to frontline staff in private. Both sides are to blame here: management can’t be expected to recognise the problem when they are continually told that the “heap of crap” they ask staff to use is “powerful fertiliser”, and senior managers need to reward those who bring them bad news.

The work of Psychiatrist Dr Joe McDonald, who is one of the most insightful observers of UK Health IT, with www.comparethesoftware.co.uk could provide a good start as could ideas from NHS Hack Day about a broader NHS Bugs reporting system.

  • Participation – We need to engage in agile, user-centred design that allows developers to truly understand and deliver against frontline end-user needs. This means embedding domain experts, end-users and their customers/patients in development teams through the entire product life-cycle (but not for so long that they “go native”) and careful testing of software in real situations with realistic data.

We also need to involve frontline staff in procurement decisions, but not in the tokenistic way that often happens. At the very least they need a veto, and ideally it should be they, not IT departments or senior management, who make the final choice.

  • Design – Design, design and more design. Good design ensures form-fit-for-function and should create software that is lusciously desirable, highly functional and which adapts to and remembers users’ needs and preferences. Design needs to be all pervasive. Software code should be finely wrought, poetically beautiful and lovingly commented, adhering to clear and consistent coding standards. Workflows should be carefully designed to provide efficient process flows and be capable of adapting to real-world variation in process. Information presentation should apply the science of information design and excellent graphic design. User-interface design should similarly apply both science and art, to create interfaces that are beautiful, intuitive, accessible and safe. Good design requires input in the development process from technical, creative and human-factors designers. There is a wealth of literature and experience to help system designers if they just go and look for it.

  • End the dataset mentality – Except in exceptional circumstances, only the information that is necessary and appropriate to the delivery of optimal care of the individual patient should be collected. This includes information to support both clinical and supporting business processes and the measurement of outcomes, but not data for purposes which can’t directly contribute to good outcomes for the individual patient. This information varies from patient to patient and encounter to encounter and can’t be usefully expressed as a mandatory minimum data set. Data sets are never minimal but inevitably fail to include information that will be critical in some cases. The quality of data collected under duress, which has no value to those collecting it, will be poor, as users will guess it or make it up if it is not immediately to hand, making it worse than useless for the purposes for which it was included in the dataset in the first place. Users have a natural investment in the quality of data that affects the ease with which they can do a good job. Typically such data is information rich, and secondary uses can in general be better satisfied by using only such data. However, in exceptional cases where there is a strong downstream benefit from the collection of additional data, this need must be “sold” to users, who should also be provided with regular feedback that demonstrates that their efforts are worthwhile.

While there is massive potential in using data collected routinely for secondary purposes, those doing so must always be aware of the risks of using data for purposes other than those for which it was collected. If those entering data don’t know about the other ways the data they are entering will be used, they can’t help ensure that it is also fit for these purposes. I would not go as far as the eponymous Dutch health informatician who coined Van de Lei’s law, “Data should not be used other than for the purposes for which it was collected”. However, it is important to understand the many data quality issues associated with health data and the risks of using data beyond the “Hawking Horizon”.

One way of mitigating these risks is to inform those who collect data how it will be used and to feed back to them the results of its use. This will not only encourage them to consider how they can make it fit for these secondary purposes, but will also mean you are more likely to be alerted when you draw conclusions that those close to the data know it does not support.

  • Finally of course Zero Tolerance – Zero tolerance for systems that make life harder for frontline staff, zero tolerance for those who fail to involve frontline users in design and procurement decisions and zero tolerance for the dataset mentality and most of all zero tolerance for poor design.

NHS-Life Sciences Partnership

“The NHS should be “opened up” to private healthcare firms under plans which include sharing anonymous patient data, David Cameron is due to announce”
http://www.bbc.co.uk/news/uk-16026827

25 years ago I launched AAH Meditel. My plan was to give GPs free computers in return for anonymised patient data, which I planned to sell, primarily for life-sciences research. Today’s endorsement of this concept by Prime Minister David Cameron is therefore one that I welcome, but with some critical reservations.

AAH Meditel was successful in establishing a large database of over 5 million patient records, and one competitor, VAMP (now part of INPS), who launched at the same time, did something very similar. The commercial models didn’t work (we were too far ahead of our time in so many ways), but it is the process we started, later built upon by others (notably EMIS), that has provided the foundations on which today’s announcement is made.

Over the past 25 years I and others in the primary care informatics community have learnt a great deal about the issues associated with building a longitudinal “cradle – grave” record and in particular those that arise when you start to share it and use it for both primary and secondary purposes distant from those purposes in the minds of those who created the record.

The value of this record is created by the willingness of patients to divulge often sensitive information to healthcare professionals. They do this primarily to get the care they need, but we also know that when asked, the vast majority are happy for it to be used for other purposes, particularly medical research, as long as all practical steps to protect their privacy have been taken. David Cameron has made it clear that such steps will be taken, but I have little confidence that Government understands what is necessary and possible, or that the research community goes much beyond lip-service in its attempts to address these issues. It is clear to me that while the research community has no need or desire to compromise patient privacy, it also has little willingness to take the problem seriously, and so risks creating a public backlash and, worse, undermining patient confidence in the doctor-patient relationship that lies at the heart of health care.

I want to see health data used to support the British life sciences industry, but more importantly I want to protect patients’ confidence in their relationship with those who provide their healthcare. I believe if we get it right we can have both, but to do so we have to protect certain key principles:

1. The use of patient data for research is a privilege that patients grant not a right for researchers to take. Patients must be able to opt-out; we know that very few will choose to do so and by denying those who wish to the opportunity we create much unnecessary conflict.

2. It is not a simple matter to protect personal information, and comprehensive anonymised data can often be easily re-identified. It is important that those concerned properly understand the risks and how privacy enhancing technologies can mitigate these risks if applied as part of an appropriate governance framework.

3. There must be an acknowledgement by the research community that their first duty is to respect the wishes of patients and the privacy of their data, not their research.

4. We must recognise that while health data is a valuable resource, its fitness for purposes distant from those for which it was collected is not as great as some might believe. We have much work to do to understand and improve the quality of data (see my blog http://wp.me/p1orc5-15 and http://wp.me/p1orc5-13 )

The BCS Primary Health Care Group published a discussion paper in March this year which I think provides a good starting point http://www.phcsg.org/main/documents/PrivacyandConsent.pdf

BCS Health have a much longer document in preparation, “Fair Shares for All”, which should appear soon. This provides an extensive review of the issue, including a comprehensive review of patient attitudes on which I draw in making some of my statements above.

Let’s make the most of the opportunity, but please, be careful out there. Privacy is a fundamental human right, and should not be treated as an inconvenience by those wishing to use patient data for purposes other than care.

In Summary

Record summaries can play an important role in the facilitation of clinical communication, but only with careful thought, which is often lacking, as illustrated by the sad tale of the English Summary Care Record (SCR) and the happier tale of the Scottish Emergency Care Summary.

The first thing to understand about a “summary record” is that the term is meaningless. There are potentially a practically infinite number of summaries that could be created from an entire record, but for a summary to be useful it has to be for a clear purpose understood both by those creating and those using the summary. Scotland got this right with a clear purpose, “Emergency Care”, now being successfully extended for another clear purpose as the “Palliative Care Summary”. While the English SCR is basically also an emergency care summary, it has been touted as suitable for a whole range of purposes, including many for which it is not fit (“when your only tool is a hammer all your problems look like nails”), and while this is not the only reason for its lack of progress, this lack of clarity of purpose remains a key problem for the SCR.

What then is a record summary and how can we describe the various types of summary that might be useful?

First, let’s consider the difference between an integrated and a standalone summary. The distinguishing feature of an integrated summary is that the rest of the record from which it is drawn is immediately available, i.e. you can “drill down” into the summary and see the detailed information that underpins it. Today, such drill down facilities are usually only available when the record and summary are part of the same system, but this need not be the case, and in the future it will become increasingly common to be able to drill down to data held in other systems. With a standalone summary the user only has easy access to the information held in the summary, and it thus has to contain all the information that might be needed to support its purpose. Clearly a different approach is required in the design of a standalone summary compared to an integrated one. The summary view in a GP system is an integrated summary, while the SCR and ECS are standalone summaries.

I also find it useful to think of summaries as being either horizontal or vertical, although this is sometimes an oversimplification. A horizontal summary is one that is wide but shallow: it contains top-level information from many parts of the record. The SCR, the ECS, and the summary view in a GP system are all horizontal summaries. A vertical summary is narrow but deep, containing most or all of the information from a few parts of the record. A medication summary would be an example of a pure vertical summary. A more complex example would be a disease specific summary, say a diabetic shared care record. This is a summary of the record containing detailed information relating to diabetic care (and thus vertical) but would also have summary information from other parts of the record (and thus have some horizontal components). The Scottish Palliative Care Summary is another example of a complex summary.
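The horizontal/vertical distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the record layout, the section names and the `top_n` parameter are hypothetical and do not describe the SCR, the ECS or any real GP system.

```python
from typing import Dict, List

# Hypothetical record layout: section name -> entries, most significant first.
Record = Dict[str, List[str]]


def horizontal_summary(record: Record, top_n: int = 1) -> Record:
    """Wide but shallow: the first top_n entries from every section."""
    return {section: entries[:top_n] for section, entries in record.items()}


def vertical_summary(record: Record, sections: List[str]) -> Record:
    """Narrow but deep: every entry, but only from the chosen sections."""
    return {s: list(record[s]) for s in sections if s in record}


# Made-up example data, for illustration only.
record = {
    "problems": ["type 2 diabetes", "hypertension", "eczema"],
    "medication": ["metformin", "ramipril"],
    "allergies": ["penicillin"],
}

wide = horizontal_summary(record)
deep = vertical_summary(record, ["medication"])
```

A complex summary, such as a disease specific one, would in this sketch be a vertical summary of a few sections merged with a horizontal summary of the rest.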

My view is that to be really useful summaries should be stateful, i.e. they should reflect the current state of the patient. With a simple summary like a medication summary it may be possible to maintain statefulness automatically, but in more complex summaries a degree of human intervention is usually required to maintain statefulness, and this is well described by Ian McNicholl in his blog http://bit.ly/dT3d8u where he talks about the maintenance of a “meta-narrative”. Some summaries are stateless and just represent a journal of all activity within the scope of the summary. The Spine’s Personal Spine Information Service (PSIS) and the abandoned London Shared Record were such stateless records, just a spike onto which a range of clinical correspondence had been placed. Such summaries have some value, but to understand the current state of the patient the user has to read back through the record and risks missing significant information buried at the bottom of the pile. Such summaries have the advantage that they can be automatically created drawing on a large range of sources, but in my view, unless supplemented by an appropriate stateful summary, they are of limited value. Some proposals for the SCR would combine the stateless PSIS with the GP maintained stateful SCR as it is today.
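For the simple medication case, deriving a stateful summary from a stateless journal might look something like the sketch below. The event structure and the "start"/"stop" vocabulary are assumptions made for illustration; real systems record far more detail (dose changes, repeats, acute issues), and the more complex summaries discussed above would still need human judgement.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set


@dataclass(frozen=True)
class MedicationEvent:
    when: date
    drug: str
    action: str  # "start" or "stop" (hypothetical vocabulary)


def current_medication(journal: List[MedicationEvent]) -> Set[str]:
    """Replay the journal in date order and return the drugs still active."""
    active: Set[str] = set()
    for event in sorted(journal, key=lambda e: e.when):
        if event.action == "start":
            active.add(event.drug)
        elif event.action == "stop":
            active.discard(event.drug)
    return active


# A stateless journal: everything that happened, in no guaranteed order.
journal = [
    MedicationEvent(date(2011, 3, 9), "amoxicillin", "stop"),
    MedicationEvent(date(2011, 1, 10), "ramipril", "start"),
    MedicationEvent(date(2011, 3, 2), "amoxicillin", "start"),
]

state = current_medication(journal)
```

The reader of the journal has to work through all three entries to learn what the stateful summary says directly: the patient is currently on ramipril only.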

One of the key uses I see for a summary is the sharing of information and the management of a care plan and care pathways over the Hawking Horizon http://bit.ly/jAoFWV (a patient is often on many pathways but there should only ever be a single integrated care plan). Ideally such a summary should operate under shared governance and be the result of considered publication into the shared space by all of the actors involved in the care plan (including the patient and members of their informal care networks). Systems should facilitate the creation of such summaries, automatically updating those parts that can safely be automatically maintained, but the summaries will still require active maintenance by the human actors involved. The systems managing the summary should provide mechanisms for reconciling information coming from multiple sources and resolving differences of view (some analogies exist with the Wikipedia approach, which allows the resolution of differences on a discussion page behind each article) but should be able to represent dissonance where resolution can’t be achieved.

Summaries are often implemented as read-only, and this approach certainly simplifies the technical and governance issues associated with keeping a summary in sync with its source systems. It might, however, be desirable to allow information to be edited or entered into the summary, with mechanisms to update source systems, but any such mechanism should not pollute source records with information they don’t need, don’t want or don’t consider of adequate quality.

Summaries could usefully supplement and integrate with the stateless status feed proposed for Fredbook http://bit.ly/lfq3f3 and form part of the rich online environment in which patients, informal care networks and healthcare professionals come together.

Secondary Uses of Data – A Poacher’s Tale

Early in my career in health informatics I had plans to make myself fabulously rich by selling pseudonymised patient data from GPs for a range of secondary purposes. I managed to spend £15 million of my backers’ money giving away 1000 GP systems and established a database of 6 million patient records. It all ended in tears (at least from the financial perspective) in 1992.

In those early days I had a naïve view of the extent to which pseudonymisation could protect patient privacy and the ease with which data could be used for secondary purposes, so in this world I am very much a poacher turned gamekeeper, but one that still believes in the massive benefits that could flow from intelligent secondary use of patient data.

In this blog piece I won’t dwell on issues of patient privacy. Suffice it to say for now that I don’t now believe that pseudonymisation of rich datasets is fully effective, but I do believe that with a sophisticated approach we can adequately protect patient privacy when we use patients’ data for secondary purposes. What I want to concentrate on here are the challenges of using data for secondary purposes that have nothing to do with the need to protect patient privacy.

There are two ways in which we might consider the use of data secondary: the first is that the use is not directly connected with the care of the individual patient whose data it is, and the second is that it’s a use not of direct concern to the person collecting the data. Here I want to concentrate on the second definition: uses with which the collector of the data is not concerned and of which they may even be ignorant. (There are clearly some secondary uses in the sense of the first definition with which the data collector is very concerned – maybe their own research interest.)

There are a number of issues that need to be considered when using data for secondary purposes.

• What were the primary purposes for which the data were collected and how do the requirements of these primary purposes fit with the proposed secondary uses?

• Is there a conflict as to how something is best recorded for the secondary purposes? The requirements of the primary use should and will prevail.

• How aware is the data recorder of the secondary purpose? Awareness may encourage the recorder to take more care that the data is fit for the secondary purpose, or may result in a range of gaming activities when they have motivation to “spin” the results of the secondary use either to their own benefit or that of the patient, e.g. blood pressure readings clustering just below the QoF cut-off point.

• How important is accuracy in the recording of data to the recorder? This is a particular issue where users are forced to record data by system design or management pressure. If you have to record something but the accuracy of the record has no direct impact on you, then you may guess or make up data or just type any old rubbish to get past a mandatory field for which you don’t have valid data. E.g. a GP recording prescribing details will take great care to record the information accurately, as this will be used to produce the prescription and errors would create a serious patient risk, whereas they might be tempted to just guess to complete a mandatory dataset where they don’t see value in recording the data.

• Are definitions shared between the primary and secondary purpose and between different recorders, and have the recorders even been told what assumptions about definitions have been made? Researchers are typically much tighter than frontline data recorders, e.g. some clinicians will record a diagnosis of “asthma” on the basis of limited clinical findings, just because it is probably right, while others will want further confirmation and just record it as “wheezing”.

• System design and configuration can have a profound effect on what and how people record data and the extent to which they code data. Most work using data from multiple GP systems assumes data across different systems are directly compatible when the evidence suggests this is often not the case – work by Professor Simon de Lusignan based on video observation of many consultations shows a four-fold difference between the major systems in the number of consultations with no coded data and a two-fold difference in the average number of codes used http://bit.ly/kO5tgw . He also found that the way different systems manage picking lists had a significant effect on the data entered http://1.usa.gov/itbtc5 . Secondary uses have to take account of system biases.
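As a rough illustration of the blood pressure example in the list above, a secondary user could compare how many readings fall just below a target with how many fall at or just above it. The function name, the 5 mmHg band and the cut-off of 150 used here are all hypothetical choices for the sketch, not the actual QoF rules, and a high ratio is only a prompt to look more closely (digit preference and genuine treatment effects can produce similar patterns).

```python
from typing import Iterable


def threshold_clustering_ratio(
    readings: Iterable[int], cutoff: int, band: int = 5
) -> float:
    """Ratio of readings just below the cutoff to those at or just above it.

    Counts values in [cutoff - band, cutoff) against [cutoff, cutoff + band).
    """
    below = above = 0
    for value in readings:
        if cutoff - band <= value < cutoff:
            below += 1
        elif cutoff <= value < cutoff + band:
            above += 1
    if above == 0:
        # No readings just above: infinite if any sit just below, else neutral.
        return float("inf") if below else 1.0
    return below / above


# Made-up systolic readings, for illustration only.
systolic = [138, 142, 148, 149, 149, 148, 147, 150, 152, 161]
ratio = threshold_clustering_ratio(systolic, cutoff=150)
```

With these invented readings five values sit in the band just below 150 and two in the band just above, giving a ratio of 2.5.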

This brings us to Van de Lei’s law, coined by the eponymous Dutch health informatician: “Data should not be used other than for the purposes for which it was collected”. While I would not take this extreme position (and I suspect Van de Lei said it to emphasise the point, rather than to be taken literally), there are significant challenges in using data where the use is not one that was in the mind of the recorder when they recorded it.

There is a massive growth of interest in health analytics based on data extracted from GP systems. Data quality is adequate for many of these purposes but not as good or consistent as some secondary users seem to assume. While there can be dangers in telling recorders about the secondary uses to which the data they enter will be put, in most cases these are greatly outweighed by the benefits of making recorders aware of secondary uses and trying to secure their cooperation to make sure what they enter is fit for those uses.

Users of data for secondary purposes, beware.

Beyond the Hawking Horizon

The idea that a single shared electronic health record (SSEHR) operating over a wide geography, serving many care settings and diverse professional groups, is a good idea is one that has some currency in the NHS. However, evidence seems to be growing that this approach does not lead to more effective care and communication and brings new problems of its own.

My colleagues and I in the British Computer Society Primary Health Care Group (PHCSG) have been struggling to untangle the issues that flow from SSEHRs and have contributed to guidance on their use intended to help achieve a better balance between the benefits and problems they bring. However, after much debate I and those of my colleagues involved in this work have concluded that the SSEHR is a fundamentally flawed idea and one that we should not pursue further.

As always with our debates we have struggled with the semantics of our discourse. What is a record? What is an EHR? What do we mean by an SSEHR, and what differentiates it from an EHR? So first some definitions. There are various terms in use for EHRs, with subtle differences in meaning that are not always agreed or understood: EHR, EMR, EPR, PHR and HER (the last created by the default auto-correct setting in MS Office). I’ve wasted too much of my life on these definitions, so I am going to call them all ExRs and let others botanise about them.

So what then do I mean by an SSEHR? Sadly, applying common meaning to the name is misleading. It is single, in that it is the main record of prime entry and reference for those that use it. (So it’s not a summary record or a consolidated record created from other records of prime entry.) It’s shared, but then with a few very limited exceptions all records are shared (indeed the facilitation of sharing is one of a record’s main purposes), but to meet our definition of an SSEHR it has to be shared widely both geographically and functionally, certainly beyond a single organisation or care setting and also across diverse users. It is this degree of sharing that differentiates an SSEHR from other ExRs and which is the root of its problems.

SSEHRs are shared beyond a single domain of trust, beyond a single homogenous record culture and on too broad a scope for a single set of governance arrangements to be meaningfully applied, and it is this broad scope of use that is at the heart of the problems with the SSEHR. The first set of issues is around data security, privacy and consent, the second around record quality and the third around innovation and choice. The first gets the most attention but, while important, I think these problems don’t represent the biggest challenge for the SSEHR, so in this blog piece I’m going to concentrate on the second set of problems, around record quality. I shall come back to the other two sets of issues in a later blog.

I’ll pick up on a more detailed discussion of the definition of record quality and the purposes of ExRs another time, but for now let’s just say that quality is about fitness for purpose and that ExRs have a wide range of purposes. Even within a single organisation with a shared record culture and governance framework these purposes are not fully compatible, and the record needs to be a compromise between these purposes which reflects the weight given to each by the users of the record. As the scope of sharing increases, the dissonance between the various purposes becomes greater and the extent to which all users understand the purposes of all other users reduces, and we reach a point where the utility of sharing starts to fall as the scope of sharing increases. I call this the Hawking Horizon in acknowledgment of my friend and colleague Mary Hawking, who is responsible for so much of the best thinking about this problem. Exactly where the Hawking Horizon lies is open to debate, and its position can certainly be affected by the quality of systems design, governance arrangements and user training, but the Hawking Horizon is clearly closer than the boundaries of many SSEHRs we are attempting to implement today. Probably, to keep within the Hawking Horizon, a record’s scope should not extend beyond a single service or domain of trust (i.e. a GP practice, hospital department or community service) and we should look to other mechanisms to share and communicate over the Hawking Horizon (other types of shared record, i.e. vertical and horizontal summaries, and purposeful clinical communication – more about these in a later blog).

What then are the practical problems that arise when we try to push the scope of a shared record beyond the Hawking Horizon? Firstly, we get conflicts of purpose, with users recording information in ways fit for their purpose but actively damaging to the purposes of other users. Some examples reported to the PHCSG include:

• The recording of a rogue high blood pressure in an out of hours emergency for a patient whose blood pressure is otherwise normal, undermining the QoF target for a GP.
• The use of the diagnostic label “stroke” for every encounter between a patient and physiotherapists for rehabilitation treatment following a single stroke, distorting incidence data.
• The referral management centre who recorded a hysterectomy, as this was the reason for referral, which, if not spotted, would have excluded the patient inappropriately from further cytology screening.

Secondly, we get irresolvable differences between users with no governance arrangements in place to resolve them. Again, examples reported to the PHCSG include:

• The podiatrist who refused to remove a diagnosis of diabetes from a patient where the GP had biochemistry results which proved conclusively that the patient was not an untreated diabetic, even though she had a leg ulcer that the podiatrist reasonably considered to be a classic diabetic leg ulcer.

• The GP and social worker who could not agree on the diagnosis of bi-polar disorder, because the patient would not accept the diagnosis, which the social worker considered to be a social construct.

All of these issues are potentially resolvable through better system design, clear governance arrangements and better user training, but in practice become irresolvable when the scope of the record gets too great. It is much better that each user shares their primary record only with those within their Hawking Horizon and uses other methods (described briefly above) to communicate beyond it.

When the record quality issues of an SSEHR are added to the security, privacy and consent issues associated with such records and considered alongside the ossifying effect they have on competition, choice and innovation, we really have to think again.

I shall return to this and associated issues in future blogs and try to describe some alternative approaches that make it easier to get the better, more appropriate sharing of information and communication that can lead to better care.