Subscribe: Words and what not
http://ultimategerardm.blogspot.com/atom.xml

Words and what not





Updated: 2017-11-18T08:40:44.516+01:00

 



#Wikipedia vs #Wikidata - the George Polk Awards

2017-11-18T08:40:44.544+01:00

Some Wikipedians consider Wikidata inferior, so much so that they agitate towards a policy that bans Wikidata in "their" Wikipedia. They are welcome to their opinion.

I do bulk imports from Wikipedia and all the time I suffer the consequences. Some three to four percent of their data is wrong for all kinds of reasons, reasons that are manageable with proper tooling.

The George Polk Award is an award for journalism and it got my attention again because the International Consortium of Investigative Journalists received it for their work on the Panama Papers. I noticed that many of the people listed as winners of the Polk Award did not have articles in Wikipedia, that many of the links in the list of award winners pointed to the wrong person and that many award winners did not even have a "red link".

I am in the process of checking all the links and adding the date for the award. I found many issues, among them a civil war general and many other false friends. I am adding items for the people who do not have an English article and I have to check each of them because several do have articles in other languages. It is a lot of work and it is not as useful as it could be because Wikipedia hates Wikidata and we do not collaborate, we do not work together.

There is a Listeria list of winners and slowly but surely it will contain information similar to the English Wikipedia list article. Similar but not the same:
  • the false friends will not be there, 
  • there will be no red or black links
  • people who won the award twice will be missing
Why do this, why spend so much time on one big list? Well, in this day and age of "fake news" we should celebrate journalism, but having all this information in Wikidata allows for all kinds of tools as well. We can check for false friends, we can check if the articles on the award winners include the award, and we can also check if there are "winners" who are not known in this list or in the source available for the George Polk winners.
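The kind of check described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `compare_winner_lists`, the `(name, year)` tuples and the sample entries are all made up, and real data would have to come from the Wikipedia list, from Wikidata and from the award's own source.

```python
def compare_winner_lists(wikipedia_list, wikidata_list, source_list):
    """Compare award winners, given as (name, year) pairs, across three lists.

    All three inputs are hypothetical; nothing here talks to Wikipedia or Wikidata.
    """
    wp, wd, src = set(wikipedia_list), set(wikidata_list), set(source_list)
    return {
        # Entries the Wikipedia list has but Wikidata lacks (possible false friends).
        "only_in_wikipedia": sorted(wp - wd),
        # Entries Wikidata has but the Wikipedia list lacks.
        "only_in_wikidata": sorted(wd - wp),
        # "Winners" that the award's own source does not know about.
        "not_in_source": sorted((wp | wd) - src),
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    wikipedia = [("Winner A", 1990), ("False Friend", 1992)]
    wikidata = [("Winner A", 1990), ("Winner C", 1993)]
    source = [("Winner A", 1990), ("Winner C", 1993)]
    for label, entries in compare_winner_lists(wikipedia, wikidata, source).items():
        print(label, entries)
```

Every entry such a report produces is a candidate for a human to look at, not a verdict.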

I am not a Wikipedian and truthfully I hate the endless and senseless bickering that is going on. So let me work on the data, make it available to tools. Now you Wikipedians, you may choose not to show Wikidata data in your infoboxes but you will not make your errors go away without collaboration. Yes, you can quote a source but when your data is not in line with what the source states, having a source does not do you good, effectively you provide fake information.

My request to the reasonable people at Wikipedia and Wikidata: let us work together and see how we can improve quality. Let's link wiki links (blue, red and black) to Wikidata and improve the quality of what is on offer first.
Thanks,
       GerardM



#Wikidata - women in red - May Wright Sewall

2017-11-16T09:02:59.434+01:00

On Twitter, it was mentioned that archival material of Mrs May Wright Sewall was being worked on. When you read the Wikipedia article, it becomes all too obvious how notable she was. She founded multiple organisations and was known for her suffragist ideas.

The article introduces these organisations and, to indicate the relations, new items have to be created in Wikidata. I only did two and I added her husbands, men who supported her in her undertakings.

By adding these new organisations, it becomes possible to link more people to them. They thereby gain notability and it becomes more likely that at some stage they will get their article as well. At the very least, the new people and organisations added to Wikidata complete the tapestry of information of an age gone by.
Thanks,
     GerardM



#Wikipedia - #Retraction exposing big issues in #science

2017-11-15T08:25:00.606+01:00

When a scientific paper is published, it is read and cited by other scientists to further science. It is read and cited by Wikimedians to write articles and share the sum of all knowledge. The Wikicite project provides better tooling for using these papers as a source in Wikipedia articles; it is one of the more relevant developments in combatting fake news in Wikipedia.

However, there is an issue with a substantial number of papers: they were retracted. There are all kinds of reasons possible but the bottom line is that they are not to be used as a source in Wikipedia because their findings are false.

The challenge: what papers are retracted, how are retractions and the reasons for retractions modelled, and how will we find these papers in the Wikipedia sources? Knowing retractions and acting on them will be a fine art; one publisher in South Africa, for instance, was pressed to retract a book exposing the president. There will be so many issues exposed once retractions become part of the Wikipedia workflow. Failing to make them part of it will be the worst we can do. We will not be sharing the sum of all knowledge, we will be sharing the sum of what we are told.
Thanks,
       GerardM



Judith Butler in #Brazil - a reaction in the #Wiki way

2017-11-09T09:57:30.405+01:00

When the news has it that an effigy of Mrs Judith Butler is burned in Brazil, it is time to give some attention to Mrs Butler. There is information about her and about the papers she published, and one way of adding to the relevance of Mrs Butler is by increasing the number of people she is connected to.

In 2012 she was awarded the Lyssenko award. Adding that date and the other award winners works in two ways; Mrs Butler is better connected but the other award winners are better connected as well.

There is an article for Mrs Butler in English Wikipedia but given that it is a French think tank that conferred this award, chances are that not everyone on this list has an English article. There are projects that suggest articles to write. Adding awards in this way may feed those projects. I hope so. For me that would be the best outcome that could be achieved.
Thanks,
     GerardM



#Wikipedia - Ischia International Journalism Award & the Polk Award

2017-11-09T07:30:28.397+01:00

When people win awards, they often win multiple awards. Harrison Salisbury won several awards, not only the Polk Award. The Ischia award did not have a date associated with it. I used Awarder and the data from the Italian Wikipedia because that was most convenient.

There was no article for Mr Salisbury in Italian and consequently there was no date associated with him. Mr Salisbury is represented with a red link. The list indicated 1990 and it was an easy manual edit.

As you can imagine, that red link could link to the information about Mr Salisbury on Wikidata. Showing this information to those who are interested in writing a Wikipedia article in Italian does provide pertinent information, information that should coincide with the new article. By comparing the information in Wikidata and in existing Wikipedia articles you know that the article is likely to be correct.
Thanks,
      GerardM



#Wikidata as a Wiki versus the data consumers’ perspective

2017-11-08T10:53:51.632+01:00

Wikidata is a Wiki. It follows that many people with many agendas add data to Wikidata. It is a continuous process and as is usual in a Wiki, all contributions that fit the notability requirements of the project are welcome.

The consumers' perspective seen from a Wiki point of view is a bit awkward. There are only active contributors who work towards any of the quality considerations. Even when the quality is reasonable for some, it may not be enough for others.

Both Wikipedia and Wikidata are Wikis. Both have issues from a consumers' perspective. They are already explicitly integrated through the interwiki links and implicitly through the Wiki links. One of Magnus's tools makes this visible.

When you then consider George Polk and the George Polk Award it becomes obvious that Wikis have an issue from a data consumer's perspective. In some Wikipedia articles the two are conflated. In others there is a separate list of award winners. Many of the award winners do not have an article and some of the award winners refer to the wrong person. Wikidata could do with more data; the data was imported from Wikipedia and several of the wrong persons are still wrong in Wikidata.

Both Wikipedia and Wikidata consume each other's data. Both are Wikis. There is no superiority in either project but they could compare their data and curate the differences.
Thanks,
      GerardM



#Wikipedia; Héctor Rondón did not win the #Polk Award

2017-11-07T08:01:13.018+01:00

This is Héctor Rondón, he pitches for the Cubs. He did not win the George Polk Award; Héctor Rondón Lovera did.

This is a common mistake, it happens all the time and it is where Wikidata may make a positive difference to Wikipedia. It just requires a different mindset to see why this is the right solution at this time. There are some loud Wikipedians that abhor Wikidata. This is an easy and obvious method that will improve Wikipedia and there is no sane argument why this would not work.
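One way tooling could flag such false friends is a simple plausibility check on dates once a link is tied to a Wikidata item. The sketch below is an assumption of mine, not an existing tool: the function name `needs_review`, the `min_age` threshold and the idea of passing plain years are all hypothetical. Because awards can be given posthumously, a flag only means "have a human look at this".

```python
def needs_review(birth_year, death_year, award_year, min_age=16):
    """Return True when a linked person is an implausible award winner.

    Hypothetical rule of thumb: a winner who was younger than `min_age`
    (or not yet born) in the award year, or who died before it, is
    probably a false friend. Unknown years (None) never raise a flag.
    """
    if birth_year is not None and award_year - birth_year < min_age:
        return True
    if death_year is not None and award_year > death_year:
        return True
    return False


if __name__ == "__main__":
    # Made-up years for illustration: someone born decades after the award.
    print(needs_review(birth_year=1988, death_year=None, award_year=1962))
    # Made-up years for a plausible winner.
    print(needs_review(birth_year=1900, death_year=1980, award_year=1962))
```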

These Wikipedians do not even have to notice that this is done; we can hide it from them and still do a world of good. Not just for English Wikipedia but for all Wikipedias. Ehm, for the readers of all Wikipedias.
Thanks,
      GerardM



#Wikidata - There is no such thing as a free lunch

2017-11-05T15:34:11.591+01:00

Mrs Adriane Fugh-Berman wrote a paper called "Why lunch matters: Assessing physicians' perceptions about industry relationships". There is no such thing as a free lunch and arguably this is exactly what Wikidata is offering to the bio-medical industry.

All the bio-medical papers find their home in Wikidata and there is no mechanism, there is nothing to indicate the many erroneous papers, there is nothing to indicate that specific substances have been banned from use as a medical substance. When Wikipedia is to use Wikidata for information it will be so bad.

Mr Martin Keller is a psychiatrist whose reputation was for sale. "His" paper Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial has been thoroughly debunked.

At Wikidata there seems to be the notion that facts like this are an affront to its neutrality. It is why there is no mention on the item for Mr Keller; "significant event" "ghostwriting author" was removed.

The problem is that without sufficient debunking potential for ghostwriting authors, their products and their ill effect, there is no possibility to establish the veracity of the bio-medical facts that have been imported in Wikidata. It is vital to the integrity of the Wikidata project that the Mr Kellers of this world are seen for what they are: frauds.
Thanks,
      GerardM



#Wikimedia - I endorse having a #strategy as it is good to have one

2017-11-05T08:15:31.319+01:00

Having a strategy is great. There are objectives and there is an idea how to get there. As the Wikimedia Foundation formulates its strategy, it is complicated. Complicated by necessity because it involves so many interests, people who invested so much of themselves in their project(s), people who speak so many different languages, languages that define them, people with different backgrounds because they define them as well. The strategy must be complicated because it aims to reconcile all these people and the organisations that represent them.

When you are a Wikimedian, it helps when your vision coincides with the vision implicit in this big strategy. I was asked to present at the Wikimedia Nederland conference; I presented a historic view on information gathering and sharing. The presentation was given in English because it was the one common language in the room.

I love presentations but I love talking with people even more. I was asked for the strategies behind the things that I do, the things I value. The Luc Hoffman award is an example. It does not have a Wikipedia article but the subject, the science, is of real relevance in this time of climate change. The idea of associating wiki links (blue, red and black) with Wikidata items is a non-confrontational way to bring Wikidata value to Wikipedia. Adding all the USAmerican alumni from en.wp categories will allow us to keep up with what they hold and know about even more USAmerican alumni. There is method behind the madness.

Now that the Wikimedia strategy goes to the next phase, I hope for many user stories; stories explaining what we are going to do and for whom. I also hope that technical considerations will not prevent innovation and improvements. In the end that is not what a strategy is. It is the hope for the bright future we deserve in our Wikimedia movement.



#Wikipedia - Student or Athlete or both ?

2017-10-24T14:06:52.212+02:00

College sport (football, soccer, basketball, whatever) is a USA phenomenon where young people attend a college or a university on a sports scholarship.

Wikipedia has categories for the members of such teams in the different sports, and these categories are typically a subcategory of the alumni category of a college or university.

For Wikidata the alumni are typically harvested from the specific alumni categories and as a consequence it is as if all the athletes did not have an education.

My question: how can we best associate these college/university categories with the alumni categories?
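One possible answer, sketched in Python under assumptions: if the category tree is available as a mapping from a category to its parent categories, the members of a team category can inherit the school of any alumni category found higher up. The function name `infer_educated_at`, the three dictionaries and the example category names are all hypothetical; a real harvest would read the category structure from Wikipedia.

```python
def infer_educated_at(category_parents, alumni_category_to_school, members_by_category):
    """Infer a school for people listed only in team (sub)categories.

    category_parents: category -> list of parent categories (hypothetical input)
    alumni_category_to_school: alumni category -> school name (hypothetical input)
    members_by_category: category -> list of people directly in that category
    """
    inferred = {}
    for category, members in members_by_category.items():
        schools, seen, stack = set(), set(), [category]
        # Walk up the category tree, guarding against cycles.
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in alumni_category_to_school:
                schools.add(alumni_category_to_school[current])
            stack.extend(category_parents.get(current, []))
        for member in members:
            inferred.setdefault(member, set()).update(schools)
    return inferred


if __name__ == "__main__":
    # Made-up categories and people, for illustration only.
    parents = {"Example University football players": ["Example University alumni"]}
    schools = {"Example University alumni": "Example University"}
    members = {"Example University football players": ["Player A"]}
    print(infer_educated_at(parents, schools, members))
```

People whose categories never lead to an alumni category end up with an empty set, so nothing is claimed for them.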
Thanks,
      GerardM



#Katherine - When the Cebuano issue is no longer about #Wikipedia

2017-10-22T10:31:39.604+02:00

Dear Katherine, I loved your presentation at the Berkman Klein Center for Internet and Society. It gives much to think about and it is great that you answer the question you want to answer.

You address questions like "will we let external organisations use our data for their own purposes". My suggestion to you, us all, is why not use our own data for our own purposes.

The Cebuano Wikipedia is seen as problematic on many levels. It is one of the biggest Wikipedias in number of articles and one of the smallest in the size of its community. Like any Wikipedia, its articles are harvested for use in Wikidata and that brings us to several problems but more importantly in the light of your presentation, opportunities.

Problem: the data on which the Cebuano articles are based is problematic
Opportunity: import the data in Wikidata first and do some curation there.

Problem: the data is licensed under a CC-by-sa license and Wikidata is CC-0
Opportunity: collaborate with the copyright holder and ask their permission to include the data in Wikidata

Problem: when text is generated by a bot, the text when saved in an article is fixed
Opportunity: do not save it as an article but generate the text and maybe cache the text

Problem: other organisations use our data to generate information
Opportunity: we generate information in all the 300 languages where Wikipedia does not have an article

Problem: we have information that has no article in any language
Opportunity: we generate the text and maybe cache the text

Problem: Wikimedia officials indicated that issues like the Cebuano Wikipedia are not relevant
No opportunity; opportunities for all our projects are missed

Katherine, we already generate texts using bots, we already cache our data, we do it for English, we do it for Swedish, Cebuano. Why leave it for the companies of our world to generate text where there is already so much? We can do better, do the same and do it for all our languages as well.
Thanks,
      GerardM



#Wikidata - just an award winner: Mr Shuming Nie

2017-10-21T13:02:40.076+02:00


Mr Shuming Nie is the 2007 winner of the Heinrich Emanuel Merck Prize. As such he was notable for inclusion in Wikidata.

A Wikipedia stub article was created. The article makes it plain that Mr Nie was a serial awardee and when you google Mr Nie, you find for instance the picture you see above. Mr Nie is one of many award winners that are "waiting for the recognition" of a Wikipedia article. By having these award winners in Wikidata, it becomes easier to find people, perhaps someone you care for, who are waiting for an article.
Thanks,
     GerardM



#Wikidata - motivation; thank you #Magnus

2017-10-15T11:07:22.561+02:00

I added Baratunde A. Cola to Wikidata because he won the Alan T. Waterman Award. This month a Wikipedia article was written and I wanted to add some data to the item.

I did not because functionality that is key to me was broken. A new property was added and all the work that I had done on categories no longer showed in Reasonator. There was no willingness to consider the consequential loss of functionality and the result was a dip in my motivation.

Wikidata is important to me and I asked Magnus if he would help out and change Reasonator. He did.

Now I have added information to Mr Cola based on his categories. It matters that a category like this one reflects all the people known to have played in the Vanderbilt Commodores football team.

The issue is that at Wikidata, we have lost sight of these collaborative aspects. Everybody does his own thing and we hardly consider why. It is why user stories are so important; they tell you why something is done and what the benefit is.  In the end without a benefit there is no reason to do it.
Thanks,
      GerardM



#Wikisource - the proof of the pudding

2017-10-12T13:55:52.408+02:00

A user story for Wikisource could be: As Wikisourcerers we transcribe and format books so that our public may read these books electronically.

The proof of the pudding is therefore in the people who actually read the finished books.  To future proof the effort of the Wikisourcerers, it is vital to know all the books that are ready for reading. It is vital to know this for books in any and all languages supported.

There are two issues:
  • The status of the books is not sufficiently maintained in all the Wikisources
  • There is no tool that advertises finished books
To come to a solution, existing information could be maintained in Wikidata for all Wikisources in a similar way as is done for badges. With the information in Wikidata, queries can be formulated that show the books in whatever language, by whatever author.
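What such a query would return can be sketched with plain Python over in-memory records. This is not a Wikidata query and it assumes a data model that does not exist yet: the `Book` record, its `status` values and the function `finished_books` are hypothetical stand-ins for whatever ends up being registered in Wikidata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    author: str
    language: str
    status: str  # hypothetical values: "finished" or "in progress"


def finished_books(books, language=None, author=None):
    """Return finished books, optionally narrowed by language and/or author."""
    return [
        book
        for book in books
        if book.status == "finished"
        and (language is None or book.language == language)
        and (author is None or book.author == author)
    ]


if __name__ == "__main__":
    # Made-up catalogue, for illustration only.
    catalogue = [
        Book("Book One", "Author A", "nl", "finished"),
        Book("Book Two", "Author A", "en", "in progress"),
        Book("Book Three", "Author B", "nl", "finished"),
    ]
    for book in finished_books(catalogue, language="nl"):
        print(book.title)
```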

Currently there are Wikisources that do not register this information at all. This does not prevent us from making the necessary steps towards a queryable solution. After all, adding missing badges at a later date only adds to the size of the pudding, not to the proof of the pudding.
Thanks,
     GerardM



#Wikipedia discovers #OpenLibrary

2017-10-10T07:55:04.814+02:00

On Facebook, Dumisani Ndubane posted his discovery of Open Library:
I just discovered that The Internet Archive has a book loan system, which gives me access up to 5 books for 14 days. So I have a library on my laptop!!! This is awesomest!!!
And it is. Anybody can borrow books from the Open Library (it is part of the Internet Archive). What Dumisani did not know at the time is that there are books in other languages to be found as well.

Dumisani found out by accident; he googled for an ebook called "Heart of darkness" by Joseph Conrad. His next challenge: find the books in Xitsonga, and tell his fellow Wikipedians about it.
Thanks,
      GerardM



#Wikimedia - A user story for libraries

2017-10-04T13:36:20.201+02:00

The primary user story for libraries is something like: As a library we maintain a collection of publications so that the public may read them in the library or at home.

Whatever else is done, it is to serve this primary purpose. In the English Wikipedia you will find, at the bottom of the articles for many authors, a reference to WorldCat. WorldCat is there to entice people to come to their library.

It does not work for me.

My library is in Almere. I have stated in my profile in WorldCat that I live in Almere and I have indicated that my local library is my favourite. WorldCat indicates that the Peace Palace Library is nearby. It isn't.

When it does not work for me, it does not work for other people reading Wikipedia articles and consequently it needs to be fixed. So what does it take to fix WorldCat for the Netherlands, for me? WorldCat is used by a worldwide public and all the libraries of the world may benefit when WorldCat gets some TLC.
Thanks,
     GerardM



#Wikipedia - A user story for WikipediaXL: an end to the Cebuano issue

2017-10-02T14:14:07.169+02:00

The user story for #Wikimedia is something like: As a Wikimedia community we share the sum of all knowledge so that all people have this available to them. 

As an achievable objective it sucks. The sum of all knowledge is not available to us either. To reflect this, the following is more realistic: As a Wikimedia community we share the sum of all knowledge available to us so that all people have this available to them.

When all people are to be served with the sum of all knowledge that is available to us, it is obvious that what we do serve depends very much on the language people are seeking knowledge in. What we offer is whatever a Wikipedia holds and this is often not nearly enough.

To counter the lack of information, bots add articles on subjects like "all the lakes in Finland". This information is not really helpful for people living in the Philippines but it does add to the sum of available information in Cebuano.

The process is as follows: an external database is selected. A script is created to build text and an infobox for each item in the database. This text is saved as an article in the Wikipedia. From the article information is harvested and it is included in Wikidata. One issue is that when the data is not "good enough", subsequent changes in Wikidata are not reflected in the Wikipedia article.

Turning the process around makes a key difference. An external database is selected. Selected data is merged into Wikidata. This data is used to generate only new article texts that are cached in all languages that have an applicable script. As the quality of the data in Wikidata improves, the cached articles improve.
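The turned-around process can be sketched as a small Python class. Everything in it is an assumption for illustration: the class name `ArticleCache`, the per-language format strings standing in for "scripts", the `revision` number used to notice changed data and the sample lake are all hypothetical, and real text generation would be far richer than one template sentence.

```python
class ArticleCache:
    """Generate article text from item data and cache it per language.

    A cached text is reused until the data revision changes, so better
    data automatically leads to better text on the next request.
    """

    def __init__(self, templates):
        self.templates = templates  # language code -> format string (hypothetical "script")
        self._cache = {}            # (item id, language) -> (revision, text)
        self.render_count = 0       # counts real renders, to make caching visible

    def get_text(self, item_id, data, revision, language):
        template = self.templates.get(language)
        if template is None:
            return None  # no applicable script for this language
        key = (item_id, language)
        cached = self._cache.get(key)
        if cached is not None and cached[0] == revision:
            return cached[1]
        text = template.format(**data)
        self.render_count += 1
        self._cache[key] = (revision, text)
        return text


if __name__ == "__main__":
    cache = ArticleCache({"en": "{name} is a lake in {country}."})
    item = {"name": "Example Lake", "country": "Finland"}  # made-up item
    print(cache.get_text("Q-example", item, revision=1, language="en"))
```

Nothing is saved as an article here; the text exists only as a cached rendering of the current data.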

With Wikipedia extended in this way, WikipediaXL, we become more adept at sharing the sum of our available knowledge. With caching enabled in this way, any language may benefit from all the data in Wikidata. It is important to consider the quality of new data. Data may come from a reputable source or from a source we collaborate with on the maintenance of the data. What is to be preferred is for another blogpost.



#Wikipedia - #Wikidata user stories

2017-09-30T17:27:03.422+02:00

User stories are important. They indicate why a certain functionality exists or the purpose of a project. A "user story" has a fixed format:
As a <role>, I would like to <goal> so that I <benefit>.
One user story is: As a Wikipedia editor, I can link an article to articles in other language(s) so that a Wikipedia reader can find an article in a language he or she can read.

Another user story:  As a Wikidata editor, I can maintain statements on Wikidata items so that Wikipedia readers always have the latest information available to them.

The first user story has been a resounding success. It is why Wikidata was relevant from the start. The second is very much a work in progress and it depends very much on how the current state of affairs is evaluated. There are dependencies for the efforts of so many to have an effect:
  • Readers of a Wikipedia can only see the result when the information has been included in Wikidata
  • Wikipedia readers will only see the result when the editors of their Wikipedia allow them to see it
The first dependency is with Wikidata editors but the second dependency is outside of the influence of Wikidata editors. For this reason it makes sense to formulate a different user story: As a Wikidata editor I can maintain statements on Wikidata items so that Wikipedia editors can take the responsibility to inform their public.

To help these Wikipedia gatekeepers there is a need for tools that make them aware of the information they do not provide.
Thanks,
      GerardM



#Wikimedia and its #BLP approach

2017-09-17T09:50:11.714+02:00


There is a huge controversy about the policies on the "Biographies of Living People". Central in all this is that there is no such policy at Wikidata. As a consequence, many seasoned Wikipedians are of the opinion that using Wikidata data in Wikipedia is a violation of its BLP policy. At the same time there are seasoned Wikidatans who oppose a BLP policy similar to the one at Wikipedia. The problem is that Wikidata does need a BLP policy but it needs to be different for various reasons.

  • An item in Wikidata can be really rudimentary; Marian Latour, a Dutch author, was created because she won an award. This is allowed in Wikidata but the limited information is probably a violation of the English BLP policy. This information came from the Dutch Wikipedia
  • The initial data of Wikidata were the interwiki links. This was a huge improvement for the Wikipedias and there are still many items that have no statements. This is used as an argument not to accept information from Wikidata.
  • Wikidata data is retrieved from a Wikipedia, information like "who won an award". Given the BLP policy of that Wikipedia it should be faultless but it often is not due to disambiguation issues.
The first issue refers to a red link on the Dutch Wikipedia. When the red link is associated with the Wikidata item, there will not be a new disambiguation issue when a different Marian Latour is introduced. Currently there is only one Marian Latour known to Wikidata.
The second issue is one where Wikidata statistics indicate that statements are slowly but surely being added. They also prove that there is still so much to do...
The third issue is the main one. When an article is linked to Wikidata, articles in other languages should link to the same item or to a red link. Solving these issues requires coexistence and preferably collaboration. 

What we need in a Wikipedia is the ability to link a blue or red link to a Wikidata item. Obviously changing links is either blatantly obvious like for Manuel Echeverria or it requires a source. Technically the necessary change in the MediaWiki software may be "opt in" so that only people who care about this approach to quality make use of it. 

As far as I am concerned, when some Wikipedians find fault elsewhere and do not reflect on this proposal and the improvements it brings them, that is fine. What is relevant is that this approach allows for the best Wikidata practices and at the same time improves the BLP quality in all Wikimedia projects.
Thanks,
       GerardM



The Manuel Echeverría "revenge"

2017-09-09T09:22:56.134+02:00

When there are mistakes in a Wikipedia, it follows that once information is copied from that Wikipedia these mistakes find their way into Wikidata. So Manuel Echeverria did not receive the Xavier Villaurrutia Award; Manuel Echeverría did.

So the edit that made Mr Echeverria a recipient of the award was reverted. I fixed things by using the Spanish Wikipedia as a resource instead. The dates were added when people received the award and a few missing people in Wikidata are now known as well.

I cannot be bothered to fix the English Wikipedia. There is no structural solution at this time and as far as I am concerned, there is no interest in one that has been proposed.

There is one additional reason why a solution would be advantageous; reverting edits is a hostile act when edits are made with the best intentions. By actively linking red links and black links to Wikidata, such reversions will become unnecessary.

The problem is that Wikipedians need to understand a problem that as far as they are concerned is elsewhere, and is only caused by the lack of quality of their project. It is with grim satisfaction that I know it serves them right.
Thanks,
     GerardM



#Wikimedia - Where I make a stand / where I stand for

2017-09-02T14:31:39.391+02:00

I was told that my priorities are not the shared priorities of our movement; this by a pivotal person in the WMF. I consider this a personal affront and I will spell out what I stand for and where I make a stand. When you want to personally verify the veracity of my commitment, read my blog and check out my involvement. I have blogged for over 10 years and the basics/citations are all there to find. I consider my position very much in line with what our movement is there for.

==Share in the sum of all knowledge==
This is the overarching aim of our movement. At this time we are congratulating ourselves on what we have achieved so far. There is a lot to celebrate, particularly for the English reading world.

===Everything but English===
Given that only 40% of the world population can read English, our successes need to be measured for what we do for all the people in the world. I do not care for good intentions, I care for what can be observed. Financially there is no breakdown available of the amount spent on English versus the amount spent on all the rest. This is imho a diversity issue as potent as the gender gap. All the arguments for "English first" are structurally no different from any other "my group first" arguments. Just compare the amounts given to US American chapters versus the Indian chapter. In addition you may or may not consider the cost of the software that is developed with English Wikipedia in mind.

===Internationalisation and localisation===
I have searched briefly for "internationalisation" in the 2030 strategy papers. I could not find it. It is however the bedrock of Wikipedia. It is vital for any and all of the individual features of MediaWiki. When you consider Wikimedia partners like the Internet Archive and their Open Library, we do not even consider how much we will achieve when together we reach out to the other 60% as well. Our internationalisation platform is open to our open source partners and translatewiki.net is in my opinion a strategic resource.

===Partners===
The successes of our GLAM partnerships prove collaboration serves mutual interests. There are plans to improve Commons; a key part is the Wikidatification that will open up Commons, not only in English but also in any and all other languages. Where we could make more of a difference is to help where our partners indicate what is relevant to them. We can show them the effect of the cooperation in any language. At this time what we show is limited to images. This is something we should expand on.

====Internet Archive====
The Internet Archive provides a vital service to our Wikipedias. Its Wayback Machine allows us to prove that references that used to be on the Internet existed. Effectively it is an important tool when the aim is to prevent misinformation. Its Open Library has two parts. The part I am interested in is making free e-books available to readers. We would do better when we collaborate just a bit more and help them with their internationalisation and localisation.

====OCLC====
The libraries of this world collaborate in the OCLC and share their links in one system: the Virtual International Authority File. In its WorldCat system, the idea is that people can find books in the library near to them. Thanks to the references to local libraries, it is always possible to know if a book or an author is known in whatever country. It is important for us to improve cooperation and the visibility of this collaboration for our readers and editors.

===Bringing things together===
I have helped bring data from Wikidata,[...]



#Wikidata - surge of new items

2017-08-27T10:43:31.258+02:00

Lately there has been a surge of new items coming into Wikidata. They must be quite good when you consider the number of statements. The items with no statements are mainly part of the original load, the Wikipedia articles, and their number is slowly but surely decreasing (by 1.35% in the last month).

With more items in Wikidata, there is more data to support and to edit. As it is, limits are put on the number of edits. This can be appreciated because of the current performance problems, but it is obvious that as this upward trend continues, more people and more data will come to Wikidata, to edit as well as to query.

There is plenty of data waiting in the wings to be added. The big challenge is promoting the data that is of use and will enable more collaboration both with people and with organisations.
Thanks,
      GerardM



#OpenLibrary - Charles Horn and its other volunteers

2017-08-26T08:48:05.621+02:00

There are several reasons why Open Library and Internet Archive deserve attention. They provide downloadable books in many languages and their Wayback Machine comes to the rescue when links in references in Wikipedia go stale. Have a look at the presentation from Wikimania 2017 (from 11:46).

The Internet Archive is officially one of the partners of the Wikimedia Foundation. When you ask who in the Wikimedia Foundation is the go-to person for contacts with Internet Archive, there is no answer. It is as if there is no structure in contacts with our partners even when it pays dividends to collaborate in a more structured way. When you consider the "Coleman Boat" it is just as if the macro elements are totally missing and it is left for the micro elements to make the difference.

Macro effects of collaboration with the Open Library would be:
* references are made to downloadable eBooks from Wikipedia - people read books
* localisations are made at translatewiki.net - people read books in "other" languages
* books at Open Library are in Wikidata - links to eBooks are available
* identifiers are widely shared and widely curated - work of volunteers has the biggest impact

At a micro level, collaboration is happening. Charles Horn, a volunteer at Open Library, is a stellar example. Charles added identifiers for Wikidata and VIAF to the Open Library database. He provided us with a large file of redirects and was instrumental in removing multiple Open Library identifiers for the same author. He recently produced a Wikidata query to find duplicates and the Wikidata community was made aware of this maintenance work.

Many of the macro opportunities become possible when conditions at Open Library are met. One big issue is the need for disambiguation and de-duplication. This is not helped by the massive amounts of data involved and the lack of data on the individual author level.
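
To give an idea of what such a duplicate check can look like, here is a minimal sketch in Python that only builds a SPARQL query for the Wikidata Query Service. It is not the query Charles wrote; the function name and the `limit` parameter are made up for illustration. The only Wikidata fact it relies on is that P648 is the "Open Library ID" property.

```python
# Sketch only: P648 is Wikidata's "Open Library ID" property. The function
# name and the limit parameter are hypothetical, chosen for this example.
def duplicate_open_library_ids_query(limit: int = 100) -> str:
    """Return SPARQL that lists items carrying more than one Open Library ID."""
    return f"""
SELECT ?item (COUNT(DISTINCT ?olid) AS ?ids) WHERE {{
  ?item wdt:P648 ?olid .
}}
GROUP BY ?item
HAVING (COUNT(DISTINCT ?olid) > 1)
ORDER BY DESC(?ids)
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Paste the printed text into the Wikidata Query Service to run it.
    print(duplicate_open_library_ids_query(10))
```

Every item in the result is a candidate for a human to look at: either the item mixes up two authors, or Open Library has two records for the same person.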
While individuals like Charles have an immense effect, it is in the collaboration on a macro level where even bigger differences can be made. Consider: many books include identifiers like an ISBN or a link to the Library of Congress. So it is possible to leverage a tool developed at the Wikimedia Foundation to retrieve associated metadata or to find associated data at the OCLC.

It takes just a bit of friendly prodding from the macro people at the associated organisations, some reassurance that there is support for these efforts, and there will be a lot of talent at the micro level making a big difference. Cooperation and coordination are what the organisations are to provide and we will share more of the knowledge that is available to all who come looking.
Thanks,
      GerardM[...]



#Wikidata - Martin Reints and {{Authority control}}

2017-08-20T13:24:11.863+02:00

Martin Reints received the Herman Gorter Award in 1993. There is a Wikipedia article about him and consequently he was known in Wikidata. There was no "authority control" information for Mr Reints in Wikidata yet and this was quickly remedied.

The most interesting part is that the VIAF registration for Mr Reints already included a link to Wikidata. Proof perfect that librarians are actively working on keeping their house in order. There was an Open Library entry for Mr Reints and the Dutch article had a link to the DBNL-website for Dutch language authors.

Open Library, I found, is very much about books. Their data on the books they have is great: identifiers like ISBN-10 or ISBN-13 and links to the online catalog of the Library of Congress. This makes a lookup at the OCLC for identifiers of all the authors easy and disambiguation becomes more effective.
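
Matching a book across catalogs works best when both sides use the same form of the ISBN. As a small illustration, this Python sketch converts an ISBN-10 to its ISBN-13 form using the standard "978" prefix and the ISBN-13 check digit rule. The function name is made up for this example, and the sketch does not validate the ISBN-10 check digit it is given.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 (hypothetical helper, sketch only)."""
    # Drop the hyphens and spaces that printed ISBNs often contain.
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("an ISBN-10 has exactly 10 characters")
    # Keep the first nine digits; the old check character is discarded
    # without being verified.
    core = "978" + chars[:9]
    if not core.isdigit():
        raise ValueError("the first nine characters must be digits")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0-306-40615-2"))  # prints 9780306406157
```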

Wikidata is very much about data. You can query Wikidata for all the winners of the Herman Gorter Award and to the results you can add the links to VIAF or to the Open Library. This ability to query makes all kinds of applications possible, like: "what books written by authors who won the Nobel Prize are available in your library?"
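
A query like that could look roughly as follows. This Python sketch only builds the SPARQL text for the Wikidata Query Service; the function name is made up, and the award's item ID has to be looked up first and passed in (the ID in the example call is a placeholder, not the real item for the Herman Gorter Award). The properties used are P166 (award received), P214 (VIAF ID) and P648 (Open Library ID).

```python
import re


def award_winners_query(award_qid: str) -> str:
    """Return SPARQL listing award winners with optional VIAF and Open Library IDs.

    Hypothetical helper for illustration; award_qid is the Wikidata item ID
    of the award, for example the item for the Herman Gorter Award.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", award_qid):
        raise ValueError("expected a Wikidata item ID such as 'Q123'")
    return f"""
SELECT ?person ?personLabel ?viaf ?openLibrary WHERE {{
  ?person wdt:P166 wd:{award_qid} .
  OPTIONAL {{ ?person wdt:P214 ?viaf . }}
  OPTIONAL {{ ?person wdt:P648 ?openLibrary . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "nl,en" . }}
}}
ORDER BY ?personLabel
""".strip()


if __name__ == "__main__":
    # "Q12345" is a placeholder; replace it with the award's real item ID.
    print(award_winners_query("Q12345"))
```

The two OPTIONAL clauses are deliberate: winners without a VIAF or Open Library link still show up, with empty cells, and those empty cells are exactly the work that is left to do.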
Thanks,
      GerardM



#OpenLibrary and winners of the Herman Gorter Award

2017-08-19T12:58:50.076+02:00

If you want to know if the Open Library is of relevance in other languages, you have to do some research. I wanted to find out if there are publications by the authors who won the prestigious Herman Gorter Award.

This award was conferred from 1945 to 2002, often to multiple authors. The first author not known to Open Library is H. C. ten Berge. He received the Herman Gorter Award in 1964. For several authors, Wikidata did not yet have a link to Open Library.

Now consider this: what if we could query Wikidata for all the authors and their publications in Open Library? 

Just a little bit more metadata about books, publications is what we need.. It is not really a big deal, only a few million additional records..

Many if not most of the books at Open Library have links to authorities like the Library of Congress. This makes it possible to link these books through the OCLC to "your library system". It knows about authors and that is what makes it possible to use tools instead of people to enrich Wikidata and open up all that is in the Open Library for all of us.
Thanks,
       GerardM