AI and Ethical use of data: Data Validity Part 1

As mentioned yesterday, I had the pleasure of taking a course on ethics and data science.  Given that data science is a key part of the machine learning side of AI, I thought I would expand on the subject as a starting point.  Each bullet below deserves a more detailed discussion of its own.  But I truly believe that ethical and business conduct requirements will be needed for all AI projects in order to provide transparency and explainability.


So what are the risk considerations for ethics in AI?  You will notice that there are overlapping considerations that have to be thought through and included.

  • Data Validity
  • Algorithm Fairness
  • Informed Consent
  • Model Errors
  • Societal Impact
  • Ossification/Rigidity of ML models
  • Surveillance Impacts
  • Managing Change
  • Regression
  • Bias/Variance

So, as an example, take data validity.  I am starting to see this criterion being included, since it carries a high risk of legal ramifications.  In this day and age you have access to third-party data sets (obtained legally or illegally) as well as the data sets that you have collected as an organization.  Are you questioning where the data comes from and whether proper “informed consent” was given by the individuals providing that data or information?  Did the third-party organizations thoroughly vet and validate the information?  Has it been modified, redacted, or scrambled?  Can you still identify individuals or information by extrapolation?  What proof will stand up in court if you are sued for accessing information that was not properly vetted by a third party?  Are you moving data from one line of business to another (sales to marketing, for example), and are you violating any agreements that you have with clients or leads?
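To make the "identify individuals by extrapolation" question a bit more concrete, here is a minimal sketch of one possible check, in the spirit of k-anonymity (my choice of illustration, not something the course prescribed). It counts how many records share the same combination of seemingly harmless fields. The field names and records are hypothetical and made up purely for the example.

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    values for the given quasi-identifier fields (0 if no records)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0

# Hypothetical, made-up records with names already removed.
records = [
    {"zip": "12345", "birth_year": 1980, "gender": "F"},
    {"zip": "12345", "birth_year": 1980, "gender": "F"},
    {"zip": "12345", "birth_year": 1975, "gender": "M"},
]

k = smallest_group(records, ["zip", "birth_year", "gender"])
print(k)  # 1: at least one person is unique on these three fields
```

A result of 1 means that at least one record is unique on those fields alone, so removing names did not really make that person anonymous. A real assessment would need far more than this, but it shows the kind of question worth asking of any data set.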

The course I took was produced a few years ago and had a very optimistic tone, along with the view that regulation would slow down innovation or drown skills.  The direction recommended by the professor at that time was: don’t surprise people with outcomes, and be able to explain how the model got to that outcome, but leave how and what we analyze to the data scientist.  I think we will see regulation step in.  Any time you have a practice (medicine, law, engineering, real estate, or accounting), regulation has to be in place to protect human rights and the individual.

Data before Decisions…

I have been reading the book “A Field Guide to Lies” by Daniel Levitin, which covers critical thinking principles for the information age.  Some of the discussions about statistics and probability are fundamental, but really, who does not need a refresher?  The underlying principle is that with so much information so easily accessible, many people will try to bend or present “facts” or “news” in ways that make you believe them.

I find that this book really does provide a refresher for many of us in understanding how information is presented, and it shows that even professionals (doctors, journalists, politicians) can be duped easily.

Take conditional probability: we have seen again and again that people don’t apply its principles properly, or even at all, when making decisions, and have therefore made damaging decisions affecting the general population or even in war.
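As a small worked example of conditional probability, here is a sketch using Bayes' rule for a diagnostic test. The numbers are hypothetical and chosen only for illustration (they are not taken from the book): a condition affecting 1% of people, a test that catches 90% of true cases, and a 5% false positive rate.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test), computed with Bayes' rule."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

# Hypothetical numbers, for illustration only.
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")  # 0.154
```

Even with a test that sounds accurate, a positive result here means only about a 15% chance of actually having the condition, because the condition is rare to begin with. Confusing "the chance of a positive test given the condition" with "the chance of the condition given a positive test" is exactly the kind of mistake that leads to bad decisions.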

So the point is that in order to make a decision today, you need a way of reducing the “noise” and of applying correct models to analyze information, whether it is data, news, or any other type of information.

And now, with technology able to understand and analyze all types of information and to apply the proper analytics method, people will be able to make better-informed decisions.

I raise this issue since we seem to be living in a time with so much false information, and with organizations skewing or biasing information to push people into wrong or misinformed decisions.

Thoughts?

