Planet Code4Lib



Eric Lease Morgan: Creating a plain text version of a corpus with Tika

Thu, 26 Apr 2018 00:22:19 +0000

It is imperative to create plain text versions of corpus items; text mining cannot be done without plain text data. This means HTML files need to be rid of markup. It means PDF files need to have been “born digitally” or they need to have been processed with optical character recognition (OCR), and then the underlying text needs to be extracted. Word processor files need to be converted to plain text, and the result saved accordingly. The days of plain ol’ ASCII text files need to be forgotten. Instead, the reader needs to embrace Unicode and, whenever possible, make sure the characters in the text files are encoded as UTF-8. With UTF-8 encoding, one gets all of the nice accent marks so foreign to United States English, but one also gets all of the pretty emoticons increasingly sprinkling our day-to-day digital communications. Moreover, the data needs to be as “clean” as possible. When it comes to OCR, do not fret too much. Given the large amounts of data the reader will process, “bad” OCR (OCR with less than 85% accuracy) can still be quite effective.

Converting harvested data into plain text used to be laborious as well as painful, but then a Java application called Apache Tika came on the scene. [1] Tika comes in two flavors: application and server. The application version can take a single file as input, and it can output metadata as well as any underlying text. The application can also work in batch mode, taking a directory as input and saving the results to a second directory. Tika’s server version is much more expressive, more powerful, and very HTTP-like, but it requires more “under the hood” knowledge to exploit to its fullest potential. For the sake of this workshop, versions of the Tika application and Tika server are included in the distribution, and they have been saved in the lib directory with the names tika-desktop.jar and tika-server.jar. The reader can run the desktop/GUI version of the Tika application by merely double-clicking on it.
The result will be a dialog box. Drag a PDF (or just about any) file on to the window, and Tika will extract the underlying text. To use the command-line interface, something like this could be run to output the help text:

  $ java -jar ./lib/tika-desktop.jar --help
  > java -jar .\lib\tika-desktop.jar --help

And then something like these commands to process a single file or a whole directory of files:

  $ java -jar ./lib/tika-desktop.jar -t <file>
  $ java -jar ./lib/tika-desktop.jar -t -i <input directory> -o <output directory>
  > java -jar .\lib\tika-desktop.jar -t <file>
  > java -jar .\lib\tika-desktop.jar -t -i <input directory> -o <output directory>

Try transforming a few files individually as well as in batch. What does the output look like? To what degree is it readable? To what degree has the formatting been lost? Text mining does not take formatting into account, so there is no huge loss in this regard.

Without some sort of scripting, the use of Tika to convert harvested data into plain text can still be tedious. Consequently, the whole of the workshop’s harvested data has been pre-processed with a set of Perl and bash scripts (which probably won’t work on Windows computers unless some sort of Linux shell has been installed):

  • $ ./bin/ – runs Tika in server mode on TCP port 8080, and waits patiently for incoming connections
  • $ ./bin/ – takes a file as input, sends it to the server, and returns the plain text while handling the HTTP magic in the middle
  • $ ./bin/ – a front-end to the second script taking a file and directory name as input, transforming the file into plain text, and saving the result with the same name but in the given directory and with a .txt extension

The entirety of the harvested data has been transformed into plain text for the purposes of this workshop. (“Well, almost all.”) The result has been saved in the folder/directory named “corpus”. Peruse the corpus directory. Compare & contrast its contents w[...]
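For readers who want to script the server themselves, the gist of that “HTTP magic” can be sketched in a few lines of Python. This is only an illustration, not one of the workshop’s actual scripts: the PUT request against the /tika endpoint is Tika server’s documented interface, while the helper names, the output-naming convention, and port 8080 (taken from the description of the server script above) are assumptions.

```python
from pathlib import Path
from urllib.request import Request, urlopen

def txt_name(source, directory):
    """Keep the file's name, but land it in `directory` with a .txt extension."""
    return str(Path(directory) / (Path(source).stem + ".txt"))

def file_to_text(source, url="http://localhost:8080/tika"):
    """PUT the raw file to a running Tika server; get plain text back."""
    request = Request(url, data=Path(source).read_bytes(), method="PUT")
    request.add_header("Accept", "text/plain")
    with urlopen(request) as response:
        return response.read().decode("utf-8")

# usage, assuming the server is listening on port 8080:
# text = file_to_text("harvest/item.pdf")
# Path(txt_name("harvest/item.pdf", "corpus")).write_text(text)
```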

Eric Lease Morgan: Identifying themes and clustering documents using MALLET

Thu, 26 Apr 2018 00:02:06 +0000

Topic modeling is an unsupervised machine learning process. It is used to create clusters (read “subsets”) of documents, and each cluster is characterized by sets of one or more words. Topic modeling is good at answering questions like, “If I were to describe this collection of documents in a single word, then what might that word be? How about two?” or at making statements like, “Once I identify clusters of documents of interest, allow me to read/analyze those documents in greater detail.” Topic modeling can also be used for keyword (“subject”) assignment; topics can be identified, and then documents can be indexed using those terms. In order for a topic modeling process to work, a set of documents first needs to be assembled. The topic modeler then, at the very least, takes an integer as input denoting the number of topics desired. All other possible inputs can be assumed, such as the use of a stop word list or the number of times the topic modeler ought to iterate internally before it “thinks” it has come to the best conclusion.

MALLET is the granddaddy of topic modeling tools, and it supports other functions such as text classification and parsing. [1] It is essentially a set of Java-based libraries/modules designed to be incorporated into Java programs or executed from the command line.

A subset of MALLET’s functionality has been implemented in a program called topic-modeling-tool, and the tool bills itself as “A GUI for MALLET’s implementation of LDA.” [2] Topic-modeling-tool provides an easy way to read what possible themes exist in a set of documents or how the documents might be classified. It does this by creating topics, displaying the results, and saving the data used to create the results for future use. Here’s one way:

  1. Create a set of plain text files, and save them in a single directory.
  2. Run/launch topic-modeling-tool.
  3. Specify where the set of plain text files exist.
  4. Specify where the output will be saved.
  5. Denote the number of topics desired.
  6. Execute the command with “Learn Topics”.

The result will be a set of HTML, CSS, and CSV files saved in the output location. The “answer” can also be read in the tool’s console.
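For readers who eventually outgrow the GUI, the same work can be scripted against MALLET proper. The sketch below (in Python, matching the rest of the workshop’s code) merely builds the two command-line invocations: import the corpus directory, then train the topics. The import-dir and train-topics sub-commands and their options are MALLET’s own, but the function name, the output file names, and the defaults are assumptions made here.

```python
import subprocess

def build_mallet_commands(mallet="mallet", corpus_dir="corpus",
                          num_topics=10, num_top_words=10):
    """Return MALLET's two invocations: import the corpus, then train topics."""
    import_cmd = [mallet, "import-dir",
                  "--input", corpus_dir,
                  "--output", "topics.mallet",
                  "--keep-sequence",
                  "--remove-stopwords"]
    train_cmd = [mallet, "train-topics",
                 "--input", "topics.mallet",
                 "--num-topics", str(num_topics),
                 "--num-top-words", str(num_top_words),
                 "--output-topic-keys", "topic-keys.txt",
                 "--output-doc-topics", "doc-topics.txt"]
    return import_cmd, train_cmd

# To actually run the pipeline (assuming MALLET is on your PATH):
# for cmd in build_mallet_commands():
#     subprocess.run(cmd, check=True)
```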

A more specific example is in order. Here’s how to answer the question, “If I were to describe this corpus in a single word, then what might that one word be?”:

  1. Repeat Steps #1-#4, above.
  2. Specify a single topic to be calculated.
  3. Press “Optional Settings…”.
  4. Specify “1” as the number of topic words to print.
  5. Press okay.
  6. Execute the command with “Learn Topics”.

What one word can be used to describe your collection?

Iterate the modeling process by slowly increasing the number of desired topics and number of topic words. Personally, I find it interesting to implement a matrix of topics to words. For example, start with one topic and one word. Next, denote two topics with two words. Third, specify three topics with three words. Continue the process until the sets of words (“topics”) seem to make intuitive sense. After a while you may observe clear semantic distinctions between each topic as well as commonalities between each of the topic words. Distinctions and commonalities may include genders, places, names, themes, numbers, OCR “mistakes”, etc.


Eric Lease Morgan: Introduction to the NLTK

Wed, 25 Apr 2018 23:47:28 +0000

The venerable Python Natural Language Toolkit (NLTK) is well worth the time of anybody who wants to do text mining more programmatically. [0]

For much of my career, Perl has been the language of choice when it came to processing text, but in the recent past it seems to have fallen out of favor. I really don’t know why. Maybe it is because so many other computer languages have come into existence in the past couple of decades: Java, PHP, Python, R, Ruby, Javascript, etc. Perl is more than capable of doing the necessary work. Perl is well-supported, and there are a myriad of supporting tools/libraries for interfacing with databases, indexers, TCP networks, data structures, etc. On the other hand, few people are being introduced to Perl; people are being introduced to Python and R instead. Consequently, the Perl community is shrinking, and the communities for other languages are growing. To say writing something in a “dead” language is not very intelligent may be over-stating the case, but I’m not going to be able to communicate with very many people if I speak Latin and everybody else is speaking French, Spanish, or German. It behooves the reader to write software in a language apropos to the task as well as a language used by many others.

Python is a good choice for text mining and natural language processing. The Python NLTK provides functionality akin to much of what has been outlined in this workshop, but it goes much further. More specifically, it interfaces with WordNet, a sort of thesaurus on steroids. It interfaces with MALLET, the Java-based classification & topic modeling tool. It is very well-supported and continues to be maintained. Moreover, Python is mature in & of itself. There are a host of Python “distributions/frameworks”. There are any number of supporting libraries/modules for interfacing with the Web, databases & indexes, the local file system, etc.
Even more importantly for text mining (and natural language processing) techniques, Python is supported by a set of robust machine learning libraries/modules called scikit-learn. If the reader wants to write text mining or natural language processing applications, then Python is really the way to go.

In the etc directory of this workshop’s distribution is a “Jupyter Notebook” named “An introduction to the NLTK.ipynb”. [1] Notebooks are sort of interactive Python interfaces. After installing Jupyter, the reader ought to be able to run the Notebook. This specific Notebook introduces the use of the NLTK. It walks you through the processes of: reading a plain text file; parsing the file into words (“features”); normalizing the words; counting & tabulating the results; graphically illustrating the results; and finding co-occurring words, words with similar meanings, and words in context. It also dabbles a bit into parts-of-speech and named entity extraction.

The heart of the Notebook’s code follows. Given a sane Python installation, one can run this program by saving it with a name like, saving a file named walden.txt in the same directory, changing to the given directory, and then running the following command: python

The result ought to be a number of textual outputs in the terminal window as well as a few graphics. Errors may occur, probably because other Python libraries/modules have not been installed. Follow the error messages’ instructions, and try again.
Remember, “Your mileage may vary.”

  # configure; using an absolute path, define the location of a plain text file for analysis
  FILE = 'walden.txt'

  # import / require the use of the Toolkit
  from nltk import *

  # slurp up the given file; display the result
  handle = open( FILE, 'r' )
  data = handle.read()
  print( data )

  # tokenize the data into features (words); display them
  features = word_tokenize( data )
  print( features )

  # normalize the features to lower case and exclude punctuation
  features = [ feature for feature in features if feature.isalpha() ][...]

District Dispatch: May CopyTalk: Creative Commons Certificate

Wed, 25 Apr 2018 18:08:23 +0000

The newly developed Creative Commons (CC) Certificate program was created for librarians, educators, and government in response to the continued growth in the use of CC licenses globally and the corresponding need to help people acquire Commons knowledge and skills. This session will review the CC Certificates program including content, feedback from the beta, building a train-the-trainer program for CC country chapters (and other partners) who want to offer the CC Certificates and certify others, and remixing the CC Certificates into University courses to train the next generation of librarians and educators.

Our speaker, Cable Green works with the global open education community to influence open licensing, open content, and open policies to significantly improve access to quality, affordable, education and research resources so everyone in the world can attain all the education they desire. His career is dedicated to increasing access to educational opportunities for everyone around the world. He’s a leading advocate for open licensing policies that ensure publicly funded education materials are freely and openly available to the public that has paid for them. Cable has twenty years of academic technology, online learning, and open education experience and helped establish the Open Course Library. Cable holds a Ph.D. in educational psychology from Ohio State University and enjoys motorcycling and playing in the mountains with his family. He lives in Olympia with his wife and two boys.

Join us on Thursday, May 3 at 2 p.m. Eastern / 11 a.m. Pacific for our hour-long free webinar to learn more about the program everyone has been talking about! Go to and sign in as a guest.

This program is brought to you by the Washington Office copyright education subcommittee.

Did you miss a CopyTalk? Check out our CopyTalk webinar archive!

The post May CopyTalk: Creative Commons Certificate appeared first on District Dispatch.

Eric Lease Morgan: Using Voyant Tools to do some “distant reading”

Wed, 25 Apr 2018 02:44:53 +0000

Voyant Tools is often the first go-to tool used by either: 1) new students of text mining and the digital humanities, or 2) people who know what kind of visualization they need/want. [1] Voyant Tools is also one of the longest supported tools described in this bootcamp.

As stated in the Tool’s documentation: “Voyant Tools is a web-based text reading and analysis environment. It is a scholarly project that is designed to facilitate reading and interpretive practices for digital humanities students and scholars as well as for the general public.” To that end it offers a myriad of visualizations and tabular reports characterizing a given text or texts. Voyant Tools works quite well, but like most things, the best use comes with practice, a knowledge of the interface, and an understanding of what the reader wants to express. To all these ends, Voyant Tools counts & tabulates the frequencies of words, plots the results in a number of useful ways, supports topic modeling, and supports the comparison of documents across a corpus. Examples include but are not limited to: word clouds, dispersion plots, networked analysis, “stream graphs”, etc.

dispersion chart
network diagram
“stream” chart
word cloud
topic modeling

Voyant Tools’ initial interface consists of six panes. Each pane encloses a feature/function of Voyant. In the author’s experience, Voyant Tools is better experienced by first expanding one of the panes to a new window (“Export a URL”), and then deliberately selecting one of the tools from the “window” icon in the upper left-hand corner. There will then be displayed a set of about two dozen tools for use against a document or corpus.

initial layout
focused layout

Using Voyant Tools the reader can easily ask and answer the following sorts of questions:

  • What words or phrases appear frequently in this text?
  • How do those words trend throughout the given text?
  • What words are used in context with a given word?
  • If the text were divided into T topics, then what might those topics be?
  • Visually speaking, how do given texts or sets of text cluster together?

After a more thorough examination of the reader’s corpus, and after making the implicit more explicit, Voyant Tools can be more informative. Randomly clicking through its interface is usually daunting to the novice. While Voyant Tools is easy to use, it requires a combination of text mining knowledge and practice in order to be used effectively. Only then will useful “distant” reading be done.

[1] Voyant Tools –

Library Tech Talk (U of Michigan): Pairing-with-Greg Fridays: Experiments in Paired Software Development

Wed, 25 Apr 2018 00:00:00 +0000


On most Fridays I engage in an ongoing experiment in virtual pairing using test driven development (TDD).

DuraSpace News: Register Today: Islandora Camps

Wed, 25 Apr 2018 00:00:00 +0000

DuraSpace News: Hyku News: Service Providers Collaborate to Pursue Hyku Development Goals

Wed, 25 Apr 2018 00:00:00 +0000

Service providers interested in offering hosted Hyku repository services have begun meeting regularly to collaborate on development goals.

LITA: 2018 LITA Library Technology Forum Call for Proposals – Reminder

Tue, 24 Apr 2018 17:26:59 +0000

The Library and Information Technology Association seeks proposals for the 21st Annual

2018 LITA Library Technology Forum
Minneapolis, MN
November 8-10, 2018

Submit your Proposals by May 1, 2018.

The theme for the 2018 LITA Library Technology Forum is:

Building & Leading.

What are you most passionate about in librarianship and information technology? This conference is your chance to share how your passions are building the future, and to lead others by illumination and inspiration. We invite you to rethink your assumptions on presentations and programming. Please see this previous posting for additional details about the Forum, submitting a proposal and being a presenter.

Remember the Submission Deadline is Tuesday, May 1, 2018.

Proposals may cover projects, plans, ideas, or recent discoveries. We accept proposals on any aspect of library and information technology. The committee particularly invites submissions from first time presenters, library school students, and individuals from diverse backgrounds. We deliberately seek and strongly encourage submissions from underrepresented groups, such as women, people of color, the LGBTQA+ community, and people with disabilities. We also strongly encourage submissions from public, school, and special libraries.

There are two ways of spreading light: to be the candle or the mirror that reflects it.
Edith Wharton

Submit your proposal at this site

More information about LITA is available from the LITA website, Facebook and Twitter.

Questions or Comments?

For all other questions or comments related to the LITA Library Technology Forum, contact LITA at (312) 280-4268 or Mark Beatty,

Terry Reese: Linked Data Updates in MarcEdit 7

Tue, 24 Apr 2018 16:35:59 +0000

If you use MarcEdit’s linked data tooling (or are interested in using it) — a couple of notes about updates planned for this week.  I’ve been making some targeted changes to improve performance, expanding the number of defined collections, and looking at the rules file to see if I could shift it from XML to YAML to make it easier to use with other languages (because why should we all have to recreate this work).  Here are the changes that will roll out this week (likely across two updates):

  1. The thread pool is increasing from 3 to 8. This will improve the speed of processing.
  2. I’ve changed how the tool interacts with VIAF. VIAF processing now happens via JSON, not XML.
  3. I’ve also updated VIAF so that the tool can resolve names a little better. Previously, unless the name was marked as the official main entry, it wouldn’t resolve. I’m now using a different index so that any version of a name, from any language, will generate the URI.
  4. I’ve also added profiles for every collection found in the PCC’s guide to creating URIs document. If the service has an API and can be represented either as a defined index (in a $2) or as a local index, it will be accommodated. This means that items on that list that don’t have APIs, or honestly have no business being put in MARC records (because of lack of API, open data, or license), won’t be included. At some point, I’d like to update the PCC document so that developers have the information they need to code reconciliation processes.
  5. I’ll be removing the VIAF UI elements in the Linked Data tool as these configurations should be handled via the rules file.
  6. Work URIs are moving from a UI element that is hard coded to a rules file element (for flexibility)
  7. Right now, the rules file is coded in XML — I’m experimenting with shifting the format for MarcEdit 7 to support either XML or YAML. I like the idea of using YAML, as it makes it easier for other languages to utilize.
  8. Rules file is going up on GitHub.  I’d like to encourage folks to not have to recreate this work over and over again.


These changes will come out in two releases.  The first will be tonight (4/24), the second, likely on Thursday or Friday.


Eric Lease Morgan: Project English: An Index to English/American literature spanning six centuries

Tue, 24 Apr 2018 15:48:13 +0000

I have commenced upon a project to build an index and set of accompanying services rooted in English/American literature spanning the 15th to 20th centuries. For lack of something better, I call it Project English. This blog posting describes Project English in greater detail.

Goals & scope

The goals of the Project include but are not limited to:

  • provide enhanced collections & services to the University of Notre Dame community
  • push the boundaries of librarianship

To accomplish these goals I have acquired a subset of three distinct and authoritative collections of English/American literature:

  • EEBO – Early English Books Online, which has its roots in the venerable Short-Title Catalogue of English Books
  • ECCO – Eighteenth Century Collections Online, which is an extension of the Catalogue
  • Sabin – Bibliotheca Americana: A Dictionary of Books Relating to America from Its Discovery to the Present Time, originated by Joseph Sabin

More specifically, the retired and emeritus English Studies Librarian, Laura Fuderer, purchased hard drives containing the full text of the aforementioned collections. Each item in the collection is manifested as an XML file and a set of JPEG images (digital scans of the original materials). The author identified the hard drives, copied some of the XML files, and began the Project. To date, the collection includes:

  • 56 thousand titles
  • 7.6 million pages
  • 2.3 billion words

At the present time, the whole thing consumes 184 GB of disk space, where approximately 1/3 of it is XML files, 1/3 of it is HTML files transformed from the XML, and 1/3 is plain text files transformed from the XML. At the present time, there are no image nor PDF files in the collection. On average, each item in the collection is approximately 135 pages (or 46,000 words) long. As of right now, each sub-collection is equally represented. The vast majority of the collection is in English, but other languages are included. Most of the content was published in London.
The distribution of centuries is beginning to appear balanced, but determining the century of publication is complicated by the fact that the metadata’s date values are not expressed as integers. The following charts & graphs illustrate all of these facts.

Access & services

By default, the collection is accessible via freetext/fielded/faceted searching. Given an EEBO, ECCO, or Sabin identifier, the collection is also accessible via known-item browse. (Think “call number”.) Search results can optionally be viewed and sorted using a tabled interface. (Think “spreadsheet”.) The reader has full access to:

  • the original XML data – hard to read but rich in metadata
  • rudimentary HTML – transformed from the original XML and a bit easier to read
  • plain text – intended for computational analysis

Search results and their associated metadata can also be downloaded en masse. This enables the reader to do offline analysis such as text mining, concordancing, parts-of-speech extraction, or topic modeling. Some of these things are currently implemented inline, including:

  • listing the frequency of unigrams, bigrams, and trigrams
  • listing the frequency of noun phrases, and the subjects & objects of sentences

For example, the reader can first create a set of one or more items of interest. They can then do some “distant” or “scalable” reading against the result. In its current state, Project English enables the reader to answer questions like: What is this particular item about? To what degree does this item mention the words God, man, truth, or beauty? As Project English matures, it will enable the reader to answer additional questions, such as: What actions take place in a given corpus? How are things described? If one were to divide the collection into T[...]
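As a hint of how an inline n-gram frequency service might work under the hood, here is a minimal sketch in Python. It is not Project English’s actual code; the naive tokenizer, the function name, and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter

def ngrams(text, n):
    """Count n-grams over a lower-cased, punctuation-free token stream."""
    tokens = re.findall(r"[a-z]+", text.lower())
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

text = "God is truth and truth is beauty"
print(ngrams(text, 1).most_common(3))  # unigram frequencies
print(ngrams(text, 2).most_common(3))  # bigram frequencies
```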

Eric Hellman: Inter Partes Review is Improving the Patent System

Tue, 24 Apr 2018 14:31:42 +0000

Today (Monday, November 27), the Supreme Court is hearing a case, Oil States Energy Services, LLC v. Greene’s Energy Group, LLC, that seeks to end a newish procedure called inter partes review (IPR). The arguments in Oil States will likely focus on arcane constitutional principles and crusty precedents from the Privy Council of England; go read the SCOTUSblog overview if that sort of thing interests you. Whatever the arguments, if the Court decides against IPR proceedings, it will be a big win for patent trolls, so it's worth understanding what these proceedings are and how they are changing the patent system. I've testified as an expert witness in some IPR proceedings, so I've had a front row seat for this battle for technology and innovation.

A bit of background: the inter partes review was introduced by the "America Invents Act" of 2011, which was the first major update of the US patent system since the dawn of the internet. To understand how it works, you first have to understand some of the existing patent system's perverse incentives.

When an inventor brings an idea to a patent attorney, the attorney will draft a set of "claims" describing the invention. The claims are worded as broadly as possible, often using incomprehensible language. If the invention was a clever shelving system for color-coded magazines, the invention might be titled "System and apparatus for optical wavelength keyed information retrieval". This makes it difficult for the patent examiner to find "prior art" that would render the idea unpatentable. The broad language is designed to prevent a copycat from evading the core patent claims via trivial modifications.

The examination proceeds like this: the patent examiner typically rejects the broadest claims, citing some prior art. The inventor's attorney then narrows the patent claims to exclude the prior art cited by the examiner, and the process repeats itself until the patent office runs out of objections. The inventor ends up with a patent, the attorney runs up the billable hours, and the examiner has whittled the patent down to something reasonable.

As technology has become more complicated and the number of patents has increased, this examination process breaks down. Patents with very broad claims slip through, often because the addition of the internet means that prior art was either un-patented or unrecognized because of obsolete terminology. These bad patents are bought up by "non-practicing entities" or "patent trolls" who extort royalty payments from companies unwilling or unable to challenge the patents. The old system for challenging patents didn't allow the challengers to participate in the reexamination. So the patent system needed a better way to correct the inevitable mistakes in patent issuance.

In an inter partes review, the challenger participates in the challenge. The first step in drafting a petition is proposing a "claim construction". For example, if the patent claims "an alphanumeric database key allowing the retrieval of information-package subject indications", the challenger might "construct" the claim as "a call number in a library catalog", and point out that call numbers in library catalogs predated the patent by several decades. The patent owner might respond that the patent was never meant to cover call numbers in library catalogs. (Ironically, in an infringement suit, the same patent owner might have pointed to the broad language of the claim, asserting that of course the patent applies to call numbers in library catalogs!) The administrative judge would then have the option of accepting the challenger's construction and opening the claim to invalidation, or accepting the patent owner's construction and letting the patent stand (but with the patent owner having agreed to a[...]

Lucidworks: How Amazon Can Fix Its Recommendations

Tue, 24 Apr 2018 13:00:52 +0000

Amazon is the world’s most successful online retailer, and their growth shows no signs of stopping. Many companies look to Amazon for best practices for ecommerce and online retail. However, not everything they do is right. One of their most lauded features is actually a bit weak: their recommendations. For example, this Twitter user’s post received over 404k likes:

  Dear Amazon, I bought a toilet seat because I needed one. Necessity, not desire. I do not collect them. I am not a toilet seat addict. No matter how temptingly you email me, I’m not going to think, oh go on then, just one more toilet seat, I’ll treat myself.
  — Jac Rayner (@GirlFromBlupo) April 6, 2018

The Problem

Amazon’s recommendations do one thing very well, which is that they remind you of that thing you looked at but didn’t buy. So you get notifications on your phone. You get notifications on Facebook. “Hey, didn’t you mean to buy that toilet seat the other day? Here are some you might buy.” However, you already bought a toilet seat. You’re not a toilet seat enthusiast. Most people who buy a toilet seat buy one or two, and probably the same style or brand, all in one purchase. Amazon probably has customer data that would tell them that, but they’re using a big hammer approach.

The Fix

The simplest way to fix this might be to simply set a time interval for the recommendations so eventually they time out and aren’t shown. It looks like Amazon does this already or replaces stale recommendations with something you’ve looked at more recently. However, this approach won’t work for bigger purchases where shoppers tend to take their time deciding before buying. Amazon does recommend accessories for recent purchases, so we know they capture purchase signals and recommend things based on them. My son recently bought a Nintendo Switch. Since then, Amazon has been recommending cases and other accessories. These add-ons are a great use of recommendations.

Another potential solution is to not recommend things that are similar to items a customer has already purchased. However, this doesn’t work for perishable items that we buy regularly. In fact, they might want to encourage you to subscribe to those. Amazon makes this delineation when it recommends their Dash buttons for these sorts of consumer goods. The best thing Amazon could do to fix its recommendations is to do a better job of customer profiling. If customers who buy a toilet seat don’t repeat this action in the next few weeks, then let’s not show them toilet seats again. I just bought a phone grip, why are you recommending a phone grip? On the other hand, customers who regularly buy shoes might buy them more frequently if they’re shown good quality recommendations. You can segment and identify those customers algorithmically, essentially by clustering them together based on their signal patterns. Yes, they recently bought shoes, but let’s keep offering them shoes because they are indeed shoe enthusiasts. Or people who buy old vinyl records, sure, keep it up, they buy vinyl whenever they can get it.

TL;DR

In short, Amazon should use purchase signals with a machine learning technique called “clustering” to segment customers in order to determine if they are likely to buy something similar again. If not, don’t recommend that item again. If similar customers bought similar items more than once, then go ahead and recommend it again. It goes without saying that we have tools and techniques to do all of this in Lucidworks Fusion.

Next Steps

  • Omnichannel Retail
  • Fusion 4 Overview
  • Contact us

The post How Amazon Can Fix Its Recommendations appeared first on Lucidworks.
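A toy sketch of that segmentation logic in Python: purchase signals are reduced to a flat list of category events, and a simple repeat-purchase threshold stands in for real clustering. Everything here (names, threshold, data) is invented for illustration; a production system would, as described above, cluster customers algorithmically on their signal patterns.

```python
from collections import Counter

def repeat_categories(purchases, threshold=2):
    """Categories a customer buys often enough to count as an 'enthusiast'."""
    counts = Counter(purchases)
    return {category for category, n in counts.items() if n >= threshold}

def should_recommend(category, purchases, threshold=2):
    """Recommend unseen categories; suppress one-off buys; keep
    recommending to repeat buyers (the shoe and vinyl enthusiasts)."""
    if category not in purchases:
        return True
    return category in repeat_categories(purchases, threshold)

history = ["toilet seat", "shoes", "shoes", "vinyl", "vinyl", "vinyl"]
assert not should_recommend("toilet seat", history)  # bought once: stop showing it
assert should_recommend("shoes", history)            # repeat buyer: keep showing
assert should_recommend("phone grip", history)       # never bought: fair game
```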

District Dispatch: Ready to Code library teaches Hacker Club CS and mentoring skills

Tue, 24 Apr 2018 12:15:36 +0000

Video host Cathleen Clifford sits down with Groton (Conn.) Public Library Librarian Emily Sheehan and Teen Services Librarian Jessa Franco to discuss how they are using the Libraries Ready to Code grant to teach coding skills to teens in their Hacker Club, who then teach those skills to younger students. See a demonstration of the robots and micro:bits used in the Hacker Club.

This post is the third in a series by Libraries Ready to Code cohort participants, who will release their beta toolkit at ALA’s 2018 Annual Conference.

The post Ready to Code library teaches Hacker Club CS and mentoring skills appeared first on District Dispatch.

Mark E. Phillips: Introducing Sampling and New Algorithms to the Clustering Dashboard

Tue, 24 Apr 2018 12:09:25 +0000

One of the things that we were excited about when we added the Clustering Dashboard to the UNT Libraries’ Edit system was the ability to experiment with new algorithms for grouping or clustering metadata values. I gave a rundown of the Cluster Dashboard in a previous blog post. This post walks through some of the things we’ve been doing to bring new data views to the metadata we are managing here at the UNT Libraries.

The need to sample

I’m going to talk a bit about the need to sample values first, and then get to the algorithms that make use of it. When we first developed the Cluster Dashboard, we were working with a process that would take all of the values of a selected metadata element, convert each value into a hash of some sort, and then identify where more than one value produced the same hash. We were only interested in the instances where multiple values shared the same hash. While there were a large number of clusters for some of the elements, each cluster had a small number of values; the biggest cluster I’ve seen in the system had 14 values. This is easy to display to the user in the dashboard, so that’s what we did. Moving forward, we wanted to make use of some algorithms that would result in hundreds, thousands, and even tens of thousands of values per cluster. An example of this is clustering on the length of a field: in our dataset there are 41,016 different creator values that are twelve characters in length. If we tried to display all of that, we would quickly blow up the user’s browser, which is never any fun. What we have found is that there are some algorithms we want to use that will always return all of the values, not only the ones that share a common hash. For these situations we want to be proactive and sample the cluster members so that we don’t overwhelm the user interface.
Sampling Options in Cluster Dashboard

You can see in the screenshot above that there are a few different ways to sample the values of a cluster:

Random 100
First 100 Alphabetically
Last 100 Alphabetically
100 Most Frequent
100 Least Frequent

This sampling allows us to provide some new types of algorithms while keeping the system pretty responsive. So far we’ve found this works because when you are using cluster algorithms that return so many values, you generally aren’t interested in the giant clusters. You are typically looking for anomalies that show up in smaller clusters, like really long or really short values for a field.

Cluster Options in Dashboard showing sampled and non-sampled clustering algorithms.

We divided the algorithm-selection dropdown into two parts to show the user which algorithms will be sampled and which don’t require sampling. The option to select a sample method only shows up when it is required by the selected algorithm.

New Algorithms

As I mentioned briefly above, we’ve added a new set of algorithms to the Cluster Dashboard. These algorithms have been implemented to find anomalies in the data that are a bit hard to find other ways. First on the list is the Length algorithm, which uses the number of characters in the value as the clustering key. Generally the very short and the very long values are the ones we are interested in. I’ll show some screenshots of what this reveals about our Subject element. I always feel like I should make some sort of defense of our metadata when I show these screenshots but I have a feeling that anyone actually reading this will know[...]
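The Length algorithm plus sampling can be sketched in a few lines of Python. This is my own illustration of the idea, not the Edit system’s actual code, and the example creator values are invented:

```python
import random
from collections import defaultdict

def cluster_by_length(values):
    # Length algorithm: the cluster key is simply the number of
    # characters in the metadata value.
    clusters = defaultdict(set)
    for value in values:
        clusters[len(value)].add(value)
    return clusters

def sample_cluster(members, limit=100, method="random", seed=0):
    # Keep huge clusters from overwhelming the interface by showing
    # at most `limit` members, chosen by one of the sampling methods.
    members = sorted(members)
    if len(members) <= limit:
        return members
    if method == "random":
        return sorted(random.Random(seed).sample(members, limit))
    if method == "first":
        return members[:limit]
    if method == "last":
        return members[-limit:]
    raise ValueError(f"unknown sampling method: {method}")

creators = ["Smith, John", "Smith, J.", "Smith, John.", "S"]
for length, members in sorted(cluster_by_length(creators).items()):
    print(length, sample_cluster(members))
```

The frequency-based options (100 Most/Least Frequent) would additionally need per-value occurrence counts, which this sketch omits.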

Open Knowledge Foundation: Open Research in the Philippines: The Lessons and Challenges

Tue, 24 Apr 2018 07:00:46 +0000

Authors: Czarina Medina-Guce and Marco Angelo S. Zaplan

This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Open Research Data theme.

Pioneering discussions on open research in a country where data management is still in the works can be rewarding yet challenging. In celebration of global Open Data Day, the Institute for Leadership, Empowerment, and Democracy (iLEAD) and Datos.PH initiated small group discussions on March 3, 2018. The organizations, while taking different tracks, have fielded the same question: how can we make data and literature more open for the research community? Gathering over twenty representatives from the academe, government agencies, civil society organizations, and research institutions, iLEAD embarked on a stocktaking exercise to assess the current research landscape in the Philippines. Datos.PH, on the other hand, organized a data hackathon with researchers and students, with the aim of making national datasets more disaggregated and gendered to enable analysis of datasets at the regional level.

The iLEAD Team with the participants of the Open Data Day: Roundtable on Open Research in the Philippines last March 3, 2018 in Quezon City, Philippines

Differences and Similarities

Both events steered towards the goal of widening the access of citizens, knowledge producers, advocates, and other infomediaries to data, research materials, and literature. For this, two approaches were used.
Datos.PH’s hackathon involved time spent running and analyzing datasets, while iLEAD’s event involved discussions with resource speakers from government and university libraries. Datos.PH’s hackathon brought together a small and focused group of technical data users, in this case Statistics major students, to crunch data, disaggregate national datasets, and bring gender data analysis into the open. The goal of each session was to disaggregate datasets by region and sex of the respondents. Once disaggregated, breakout groups presented initial statistical analysis of the disaggregated datasets. iLEAD’s Roundtable Discussion engaged data users and suppliers to delve into the opportunities and barriers of open research. While the two initiatives produced different outputs, both concluded that the current data landscape is still a long stretch from fully reaching various stakeholders.

A student crunching data during Datos.PH’s ODD event in Quezon City, Philippines

Lessons Learned

iLEAD was able to surface issues and concerns in opening up research from the exercise it initiated. While there have been significant strides in opening government data in previous years, there are still challenges in making the programs genuinely usable and relevant for different publics. On the side of the government, the biggest gap still lies in legal frameworks for information sharing and access, such as the long-standing contentions over the country’s Data Privacy Act and the absence of a Freedom of Information (FOI) law that would expand the scope of government information disclosure to subnational levels and other branches of the government. There is also low use of the data made available to the public, which suggests a disconnect between the data that are being disclosed[...]

Harvard Library Innovation Lab: LIL Takes Toronto: the Creative Commons Summit 2018

Tue, 24 Apr 2018 00:00:00 +0000

The Creative Commons Conference in Toronto was wonderful this past weekend! It was a pleasure to meet the mix of artists, educators, civil servants, policymakers, journalists, and copyright agents (and more) who were there.

Talks touched on everything from how the Toronto Public Library manages their digital collections and engages their local cultural and tech communities with collections, to feminist theory and open access, to the state of copyright/open access worldwide.

The range of stakeholders and interested parties involved with open access was greater than I realized. While I'm familiar with libraries and academics being interested in OA and OER, the number of government policymakers and artists who were there to learn and discuss was heartening.

Until next year, Creative Commons! And thank you, Ontario! -Brett and Casey


LITA: Join Blake Carver for an IT Security LITA webinar

Mon, 23 Apr 2018 19:24:44 +0000

Stay Safe From Ransomware, Hackers & Snoops by working on your IT Security


Blake Carver

IT Security and Privacy in Libraries
Tuesday May 1, 2018, 2:00 – 3:30 pm
Presenter: Blake Carver

IT Security starts with some basics like using good passwords, keeping everything updated and following other basic precautions. Understanding the reasons behind these rules is critical. Who are the bad guys? What tools are they using? Why are we all targets? This talk will cover how to stay safe at the library and at home.

Register Online, page arranged by session date.

Also consider:

The Privacy in Libraries LITA webinar series of 3 more webinars, continues with:

Adopting Encryption Technologies
Wednesday April 25, 2018, Noon – 1:30 pm Central Time
Presenter: Matt Beckstrom

Register now to get the full series discounts or for any single series webinar.

Discover additional upcoming LITA webinars and web courses

Questions or Comments?

For all other questions or comments related to the webinars, contact LITA at (312) 280-4268 or Mark Beatty,

District Dispatch: NCWIT AspireIT Funding Opening May 1

Mon, 23 Apr 2018 16:30:57 +0000

This is a guest post from Jennifer Manning, Program Director of AspireIT Partnerships at the National Center for Women & Information Technology, which works with the ALA Washington Office’s Libraries Ready to Code initiative and NCWIT AspireIT to connect young women program leaders to public libraries to design and implement coding programs for K-12 girls in an exciting pilot project.

Girls and women in the U.S. are avid users of technology, but they are significantly underrepresented in its creation. National Center for Women & Information Technology (NCWIT) AspireIT can help libraries inspire more girls to become technology innovators. AspireIT connects high school and college-aged members of our Aspirations in Computing community with K-12 girls interested in computing. Using a near-peer model, AspireIT Leaders teach younger girls fundamentals in programming and computational thinking in fun, creative environments that are supported by Partner Organizations from the NCWIT community. The relationship between the AspireIT Leaders and their AspireIT Partner Organizations fosters mentoring with technical professionals, increases young women’s confidence in their computing abilities, and develops valuable leadership skills. Since 2013, NCWIT has gifted more than $800,000 to 300 programs, providing an estimated 240,000 instruction hours to nearly 8,000 girls in 40 states, the District of Columbia, and the U.S. Virgin Islands. AspireIT aims to engage more than 10,000 girls by 2018. The AspireIT Partner Organization role connects you with AspireIT Leaders from our Aspirations in Computing community to share their passion by facilitating K-12 computing education in your local community, inspire future innovators, and give back. Not to mention we will provide non-profit organizations with up to $3,000 in support to run these programs!
We are in the process of opening our next round of AspireIT funding, for programs occurring between October 15, 2018 and June 14, 2019. Matching opens on May 1, 2018 and applications are due on July 1, 2018. Please:

Fill out a new interest request so potential AspireIT Leaders can connect with you.
Sign up as a potential AspireIT Partner Organization and share this opportunity with others throughout your network.
Join us on May 23, 2018, at 12 p.m. PT/3 p.m. ET for an informational webinar.

Questions? Email the team at

The post NCWIT AspireIT Funding Opening May 1 appeared first on District Dispatch.[...]

Open Knowledge Foundation: Celebrating Open Data Day 2018 in Nigeria

Mon, 23 Apr 2018 14:25:40 +0000

This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Follow the Money theme.

The concept of open data is growing rapidly across borders, and Nigeria isn’t left out of this budding movement. Paradigm Leadership Support Initiative (PLSI) and OrderPaper Nigeria joined the global open data community to celebrate Open Data Day 2018 and further contributed to the discourse on why certain data should be publicly available in both human- and machine-readable formats and accessible without any constraint whatsoever. PLSI’s local event, held at LPI_Hub within the University of Ibadan, Nigeria’s premier university, focused on promoting the use of open data in tracking audited funds for developmental projects in Nigerian local communities to foster public accountability and improved service delivery. Likewise, OrderPaper, which had developed a mobile app, “ConsTrack”, to track constituency projects, convened a townhall to celebrate the day. Its event was targeted at training community youths and raising them to become FollowtheFunds Grassroots Champions (FGCs) to track, monitor and report on constituency projects undertaken by members of the National Assembly representing the Federal Capital Territory (FCT), Abuja.

Who Attended?

PLSI assembled various stakeholders in the open data community, including data analysts, developers, creative artists, university students, Corps members and other lovers of data. 26 participants attended the event, 56% male and 44% female.
PLSI’s mentor and partner organization, BudgIT, was equally represented at the event by two members of its research team, Olaniyi Olaleye and Thaddeus Jolaiyemi. Olaniyi delivered a stunning presentation on “Budget Access – Contracting and Audit” to take participants through the transit mechanism of data from budget to contracting and audit. PLSI’s Executive Director, Olusegun Elemo, equally presented on the “Concept of Open Data” as well as a walk-through session on Citizen Participatory Audit.

Our #OpenDataDay 2018 event is ongoing and @lpi_hub is filled with data analysts, developers, creative artists, activists and lovers of data. #ODD18 — PLSI (@PLSInitiative) March 3, 2018

Also, OrderPaper had 14 participants drawn from the various area councils that make up the FCT, who were trained on the use of data and technology to interrogate constituency projects in a bid to ensure inclusiveness, transparency and accountability. It is instructive that before the event, 78.5% of the participants rated government presence (generally) in terms of infrastructure and service delivery in their respective communities below average. Specific to constituency projects, many of the participants said implementation was “abysmal”, as several communities, like Igu in Bwari Area Council, were revealed to be without a good road.

Participants at the OrderPaper Nigeria Open Data Day 2018 Breakout Session

PLSI organized a datathon exercise for participants to relate directly with audit data of the Federal Government of Nigeria. Three groups worked to analyze an[...]

Eric Lease Morgan: Using a concordance (AntConc) to facilitate searching keywords in context

Mon, 23 Apr 2018 11:53:24 +0000

A concordance is one of the oldest text mining tools, dating back to at least the 13th century, when they were used to analyze and “read” religious texts. Stated in modern-day terms, concordances are key-word-in-context (KWIC) search engines. Given a text and a query, a concordance searches for the query in the text and returns both the query and the words surrounding it. For example, a query for the word “pond” in a book called Walden may return something like the following:

1. the shore of Walden Pond, in Concord, Massachuset
2. e in going to Walden Pond was not to live cheaply
3. thought that Walden Pond would be a good place fo
4. retires to solitary ponds to spend it. Thus also
5. the woods by Walden Pond, nearest to where I inte
6. I looked out on the pond, and a small open field
7. g up. The ice in the pond was not yet dissolved, t
8. e whole to soak in a pond-hole in order to swell t
9. oping about over the pond and cackling as if lost,
10. nd removed it to the pond-side by small cartloads,
11. up the hill from the pond in my arms. I built the

The use of a concordance enables the reader to learn the frequency of the given query as well as how it is used within a text (or corpus). Digital concordances offer a wide range of additional features. For example, queries can be phrases or regular expressions. Search results can be sorted by the words on the left or on the right of the query. Queries can be clustered by the proximity of their surrounding words, and the results can be sorted accordingly. Queries and their nearby terms can be scored not only by their frequencies but also by the probability of their existence. Concordances can calculate the position of a query in a text and illustrate the result in the form of a dispersion plot or histogram. AntConc is a free, cross-platform concordance program that does all of the things listed above, as well as a few others.
[1] The interface is not as polished as some other desktop applications, and sometimes the usability can be frustrating. On the other hand, given practice, the use of AntConc can be quite illuminating. After downloading and running AntConc, give these tasks a whirl:

use the File menu to open a single file
use the Word List tab to list token (word) frequencies
use the Settings/Tool Preferences/Word List Category to denote a set of stop words
use the Word List tab to regenerate word frequencies
select a word of interest from the frequency list to display the KWIC; sort the result
use the Concordance Plot tab to display the dispersion plot
select the Collocates tab to see what words are near the selected word
sort the collocates by frequency and/or word; use the result to view the concordance

The use of a concordance is often done just after the creation of a corpus. (Remember, a corpus can include one or more text files.) But the use of a concordance is much more fruitful and illuminating if the features of a corpus are previously made explicit. Concordances know nothing about parts-of-speech nor grammar. Thus they have little information about the words they are analyzing. To a concordance, every word is merely a token — the tiniest bit of data. Features, on the other hand, are more akin to information because they have value. It is better to be aware of the information at your disposal as opposed to simple data. Do not rush to the use of a concordance before you have some information at hand.

[1] AntConc –[...]
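A toy KWIC engine producing output like the Walden example takes only a few lines of Python. This is my own illustration of the idea, not AntConc’s implementation, and the sample sentences are paraphrased:

```python
import re

def kwic(text, query, width=20):
    # Return key-word-in-context lines: the match plus `width` characters
    # of context on either side, with the left context right-aligned so
    # the keywords line up in a column.
    lines = []
    for m in re.finditer(re.escape(query), text, re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append(f"{left:>{width}}{m.group(0)}{right}")
    return lines

walden = ("I went to the woods by Walden Pond because I wished to live "
          "deliberately. The ice in the pond was not yet dissolved.")
for line in kwic(walden, "pond"):
    print(line)
```

A full concordance would add sorting by left or right context, collocate counting, and dispersion plots, but each of those is a small extension of the match list built here.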

District Dispatch: Net neutrality protections still in place (for now); ALA releases new FAQ

Mon, 23 Apr 2018 01:08:08 +0000

As many may recall, the FCC published its new net neutrality rules in the Federal Register in late February. This publication usually sets the clock ticking for the 60 days from which new regulations will take effect. In this case, however, the FCC must first get approval from the Office of Management and Budget (OMB), then publish a notice in the Federal Register announcing OMB approval and announcing the effective date for when net neutrality protections will end. That date is still unknown.

A small portion of the December 2017 order will take effect today (mostly cosmetic changes), but the bulk of the order (including the part of the decision that reclassified internet service) will not go into effect for a while longer, meaning strong net neutrality protections are still in place for the time being. In the meantime, work continues to advocate that Congress use the Congressional Review Act to roll back the December order, ALA and others are considering how best to support legal challenges in court, and state-level legislation and executive orders continue to be debated.

To help the library community better understand this critical issue, ALA Washington Office’s Senior Fellow Robert Bocher has crafted a Frequently Asked Questions document. The FAQ provides background information on previous FCC actions and court decisions on net neutrality going back over a decade and – in going forward – it reviews the possible impact of the FCC’s recent decision. In addition, the FAQ explains why an open internet is important to libraries and ALA’s commitment to intellectual freedom. References to more information on net neutrality also are provided.

The post Net neutrality protections still in place (for now); ALA releases new FAQ appeared first on District Dispatch.

Thom Hickey: Astronomical FITS images

Sun, 22 Apr 2018 18:12:05 +0000

Now for something a little different: since retiring from OCLC I don't do a lot with library metadata, but I've recently had some fun exploring astronomical images, which come with their own data/metadata format, FITS, the Flexible Image Transport System. Everyone who wants to share astronomy data uses FITS. It was developed in the early 1980s and has a strong FORTRAN flavor in how the data is stored. Having processed the variable-length fields inherent in MARC records with FORTRAN, I can appreciate the attractiveness of fixed-length blocks, arrays of binary data, and 80-byte card images to the engineers/scientists of the time. One of the wonders of our time is all the astronomical work that is being done, and that within a year or two of the observations much of the data is publicly available. The image at the head of this post came from the Hubble Legacy Archive, which has an interface that will allow you to search by name and star catalog numbers, select the type of image you are interested in, and view previews of images before downloading the FITS file. Of course most of the fun in working with the images is writing some of the code that makes it possible. There are lots of programs available that will help you look at FITS files, such as FITS Liberator, which will 'liberate' FITS images into something that Photoshop can process. Those are nice, but farther away from bare metal than I like to be. So I wrote a little program in J that does some rudimentary processing of FITS data. J (download it here) is a slightly obscure (but actively used and maintained) language derived from APL. More accurately, J evolved from APL in an effort led by Kenneth Iverson, the inventor of APL. While it does take some initial effort to become proficient in array languages such as J, it is remarkable how much can be done in a few characters. Admittedly those few characters may take some deciphering, but so would the much longer code they replace.
In some ways it reminds me of trying to use a new alphabet, such as Cyrillic. At first the script is confusing or actually misleading, but once learned the characters just become letters. I got introduced to APL in the late 1970s when I first joined OCLC. At the time OCLC ran on Sigma computers from Xerox/Honeywell. Xerox tried to compete with IBM in the early '70s, and APL was one of the few languages available on Sigma machines (in general we did most things in CP-V assembler, which was really quite nice). Their APL was clunky and slow and OCLC didn't have an APL terminal, but it worked, and I used it to do some research into how people were using search keys on the OCLC system (not so well!). J can be described as a fusion of APL and Backus's FP. It is open source, easy to install, does not require a special alphabet, and the things it can do with arrays are amazing, if not always immediately obvious. One of the things I like about it is the brevity of the code. Having experimented with compact code in Python (Z39.50 client on a t-shirt), I find it surprisingly easy to work with dense code because you can see so much of it at once. J does invite a certain amount of points-free coding, a style that confuses me at times, but can be quite elegant. Map-reduce is another style of functional programming that can be difficult to get comfortable with at first, but turns out to be very powerful. We used map-reduce extensively at OCLC, so I came to J with some familiar[...]
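The fixed-length blocks and 80-byte card images that give FITS its FORTRAN flavor are simple enough to parse by hand. A FITS header is a sequence of 2880-byte logical records, each holding 36 ASCII cards of 80 bytes, terminated by an END card. Here is a minimal sketch in Python (for real work, use an established library such as astropy); the demo header values are fabricated:

```python
import tempfile

def read_fits_header(path):
    # A FITS header is stored in 2880-byte logical records, each holding
    # 36 ASCII "card images" of 80 bytes, and ends with an END card.
    cards = {}
    with open(path, "rb") as f:
        while True:
            block = f.read(2880)
            if not block:
                break
            for i in range(0, len(block), 80):
                card = block[i:i + 80].decode("ascii")
                keyword = card[:8].strip()
                if keyword == "END":
                    return cards
                if card[8:10] == "= ":  # value indicator
                    # Keep the value, drop any trailing "/ comment".
                    cards[keyword] = card[10:].split("/")[0].strip()
    return cards

# Demo with a tiny fabricated header; real files come from archives
# such as the Hubble Legacy Archive.
demo_cards = ["SIMPLE  =                    T",
              "BITPIX  =                   16",
              "NAXIS   =                    2 / number of axes",
              "END"]
raw = "".join(c.ljust(80) for c in demo_cards).ljust(2880).encode("ascii")
with tempfile.NamedTemporaryFile(suffix=".fits", delete=False) as f:
    f.write(raw)
    path = f.name

print(read_fits_header(path))  # {'SIMPLE': 'T', 'BITPIX': '16', 'NAXIS': '2'}
```

The data arrays that follow the header are padded to the same 2880-byte boundaries, which is exactly the sort of regularity that makes FITS pleasant to process in an array language like J.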

Eric Lease Morgan: Word clouds with Wordle

Sun, 22 Apr 2018 14:36:02 +0000

(image) A word cloud, sometimes called a “tag cloud”, is a fun, easy, and popular way to visualize the characteristics of a text. Usually used to illustrate the frequency of words in a text, a word cloud makes some features (“words”) bigger than others, sometimes colorizes the features, and amasses the result in a sort of “bag of words” fashion.

Many people disparage the use of word clouds. This is probably because word clouds have been overused, the characteristics they illustrate are sometimes sophomoric, or too much value has been given to their meaning. Despite these facts, a word cloud is an excellent way to initialize the analysis of texts.

There are many word cloud applications and programming libraries, but Wordle is probably the easiest to use as well as the most popular. † [1] To get started, use your Web browser to go to the Wordle site. Click the Create tab and type some text into the resulting text box. Submit the form. Your browser may ask for permission to run a Java application, and if granted, the result ought to be a simple word cloud. The next step is to play with Wordle’s customizations: fonts, colors, layout, etc. To begin doing useful analysis, open a file from the workshop’s corpus and copy/paste it into Wordle. What does the result tell you? Copy/paste a different file into Wordle and then compare/contrast the two word clouds.

(image) By default, Wordle makes an effort to normalize the input. It removes stop words, lower-cases letters, removes numbers, etc. Wordle then counts & tabulates the frequencies of each word to create the visualization. But the frequency of words only tells one part of a text’s story. There are other measures of interest. For example, the reader might want to create a word cloud of ngram frequencies, the frequencies of parts-of-speech, or even the log-likelihood scores of significant words. To create these sorts of visualizations as word clouds, the reader must first create a colon-delimited list of features/scores, and then submit it under Wordle’s Advanced tab. The challenging part of this process is creating the list of features/scores, which can be done using a combination of the tools described in the balance of the workshop.
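For example, a colon-delimited word:frequency list suitable for the Advanced tab can be generated with a short Python script. This is a sketch: the stop word list is deliberately tiny, the sample text is paraphrased from Walden, and any feature/score pairs (ngrams, parts-of-speech, log-likelihood scores) could stand in for the raw counts:

```python
import re
from collections import Counter

# A tiny, illustrative stop word list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "i", "it", "was", "not"}

def wordle_weights(text, size=50):
    # Produce "word:score" lines for Wordle's Advanced tab, scored here
    # by raw frequency after lower-casing and stop word removal.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [f"{word}:{count}" for word, count in counts.most_common(size)]

sample = ("The pond was frozen. Walden Pond would be a good place, "
          "and the woods by Walden Pond were quiet.")
print("\n".join(wordle_weights(sample, size=5)))  # pond:3 tops the list
```

Pasting the resulting lines into the Advanced tab sizes each word by its score rather than by Wordle’s own frequency count.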

† Since Wordle is a Web-based Java application, it is also a good test case to see whether or not Java is installed and configured on your desktop computer.

[1] Wordle –

LITA: LITA, LLAMA, ALCTS collaboration FAQ #4: expressions of support, LITA/Code4Lib collaboration

Fri, 20 Apr 2018 15:58:54 +0000

On February 23, I posted for discussion a proposal on a closer formal relationship between LITA, LLAMA, and ALCTS. That included an anonymous feedback form where you can ask questions, express feelings, et cetera. I will be collating and answering these questions every few weeks here on LITAblog (so please keep asking!). Since that time I’ve gotten six (!) questions/comments. I’m going to break them up across several posts; here are the last four. (The first two are addressed in the previous post.)

As a member of ALA for over 40 years, I applaud this effort and would urge that these 3 divisions actually merge into one unit to reflect the changing structure and integration of libraries today….It’s been very confusing for a long time over who does what program and duplication is wasteful. And paying dues for 3 different sections has been frustrating….Good luck with all this!

Rejoice! I like simplification and this merger makes sense to me in this day and age of complexities.

I’m a director of cataloging at a university and active in ALCTS, but not in LITA or LLAMA. I did try LITA for a year, and while really interesting, it was a bit too far out of my regular scope of work to justify continuing to pay for it. I have frequently flirted with the idea of joining LLAMA, particularly as I take on increasing responsibility for management and leadership. Personally, I think a merger of all 3 organizations would be the most exciting and potentially useful direction for members (at least it would be for me, and I think there are others like me out there). I would love to be able to remain in my ‘home’ organization but also benefit from LLAMA resources and find ways to communicate, connect, and collaborate with LITA folks. I would ideally also like to do this without paying additional membership charges, although I don’t care what we call the new organization as long as everyone is happy.
I think there is a wealth of conversations, programming, and very healthy cross-pollination that could come out of this merger, and really look forward to it happening. And I do hope it is a merger and not just a realignment, as I think a realignment brings the risk of these organizational silos being redeveloped in the future. Thank you all for your involvement and your support.

Has anyone considered a collaboration between LITA and Code4Lib?

Yes! I want to acknowledge first that collaborating with Code4Lib doesn’t work the same way that organizational collaboration usually does, since they’re a decentralized collective without an incorporated legal form or a governance structure. Code4Lib is fundamentally a do-ocracy, and therefore it’s frequently easier for Code4Lib to approach LITA than the reverse. That said, there’s a ton of informal collaboration going on. Many involved LITA members are also involved Code4Lib members. I’m actually going back and forth between this WordPress window and the Code4Lib Slack channel right now, and I keynoted the conference a few years ago; LITA VP Bohyun Kim was on the planning committee for the most recent Code4Lib. (I’m in awe of her time management skills.) I can think of a long list of people off the top of my head who have had significant roles in both communities. If both are relevant to your interests, involvement in both is excellent and I highly encourage it. LITA was also among the organizations that offered to act as fi[...]

LITA: LITA, LLAMA, ALCTS collaboration FAQ #3: finding your niche, serving personas, participatory communications

Fri, 20 Apr 2018 15:39:57 +0000

On February 23, I posted for discussion a proposal on a closer formal relationship between LITA, LLAMA, and ALCTS. That included an anonymous feedback form where you can ask questions, express feelings, et cetera. I will be collating and answering these questions every few weeks here on LITAblog (so please keep asking!). Since that time I’ve gotten six (!) questions. I’m going to break them up across several posts; here are the first two.

[S]ince tech is such a big field…[t]here is quite a lot that is relevant to me [in LITA], but it is difficult to find sometimes within discussions….I’m worried about how much a merger with ALCTS and LLAMA would amplify this problem.

This is definitely a thing that’s on our minds as well. It can be hard to find your niche within an organization as large as ALA, and part of the role of the divisions is to make that easier; we want to make sure everyone would still be able to find their home. In the near future, you should see a membership survey asking about what you find valuable in your LITA, LLAMA, and/or ALCTS membership. This, plus recent work by LITA and LLAMA to learn about our memberships (e.g. the LITA Personas Task Force), will guide our thinking on how a combined division could be structured in order to retain the touchpoints that are most valuable to people. Combining divisions would also significantly reduce the amount of administrative overhead, which would free staff time to focus on member engagement — LITA Executive Director Jenny Levine has a lot of great ideas on that front that she doesn’t currently have time to implement.

How is combined LITA/ALCTS/LLAMA going to serve their unique personas?
This question included a lengthy set of follow-up questions and statements, including “I can see to some degree combining LITA and ALCTS; they are both technology-themed organizations”; “What was the result of the LITA personas study a few years back?”; questions about the benefit to LLAMA; and “Perhaps we should at least have a participative webinar in which LITA and ALCTS members can share their thoughts in a public forum”. I will leave questions about the benefits of LLAMA for the LLAMA leadership to answer. You can read a summary of the Personas TF work here on LITAblog, or the full report. One of our LITA personas is the administrative member. It turns out that people with titles like “Director of Libraries” or “Head of IT” make up a high percentage of LITA members; this is a group that came up through technology and loves their LITA network, but doesn’t always find the content they need today via LITA. A LITA/LLAMA connection makes a lot of sense for this group. There are certainly LITA members who have no aspiration to be in leadership and thus might not get new opportunities via LLAMA, and that’s okay! Even as it is today, LITA has a lot of niches that aren’t relevant to all its members. For instance, our interest groups include Heads of Library Technology and E-Rate & CIPA Compliance – neither of them directly serve me (I’m not a head of IT or a public or school librarian), but they’re great resources for their members, and I’m very glad that we can provide those spaces for people who are engaged in those issues day-to-day. In re the webinar, I enthusiastically agree. In fact, I think we should have a lot of participatory webinars. And social me[...]

LITA: Congratulations to Heather Moulaison Sandy, winner of the 2018 LITA/Library Hi Tech Award

Thu, 19 Apr 2018 21:36:28 +0000

Heather Moulaison Sandy has been named the winner of the 2018 LITA/Library Hi Tech Award for Outstanding Communication in Library and Information Technology. Emerald Publishing and the Library and Information Technology Association (LITA) sponsor the Award, which recognizes outstanding individuals or institutions for their long-term contributions in the area of Library and Information Science technology and its application.

The Award Committee selected Moulaison Sandy because it was impressed with her extensive contributions to ongoing professional development across the discipline, which include five books and more than 25 peer-reviewed journal articles. Her work has been presented at over 100 local, national, and international venues in nearly 15 countries as well as at numerous online webinars and talks.

Moulaison Sandy is Associate Professor at the iSchool at the University of Missouri and works primarily at the intersection of the organization of information and the online environment. She is a recipient of this year’s JRLYA/YALSA Writing Award, as well as the 2016 ALISE/OCLC Research Grant and the 2016 ALA Carnegie-Whitney grant.

An avid Francophile and traveler, she was named an Associated Researcher at the French national school for library and information science (Enssib) in 2014, and received a Fulbright Senior Scholar grant in 2008-2009 to teach at l’Ecole des sciences de l’information in Morocco. She holds a PhD in Information Science from Rutgers and an MSLIS and MA in French, both from the University of Illinois at Urbana-Champaign.

When notified she was this year’s recipient, Moulaison Sandy said, “Receiving this award is a true honor, and I am thrilled to join the ranks of LITA/Library Hi Tech award recipients whose work I admire so much.” She will receive a citation and a $1,000 stipend.

Members of the 2018 LITA/Library Hi-Tech Award Committee are: Dr. Patrick T. Colegrove (Chair), Vanessa L. Ames (Past Chair), Holli Kubly, and Christina D. Mune.

Thank you to Emerald Publishing for sponsoring this award.



Evergreen ILS: Evergreen 3.1.1 and 3.0.7 released

Thu, 19 Apr 2018 17:03:29 +0000

The Evergreen community is pleased to announce two maintenance releases of Evergreen, 3.1.1 and 3.0.7.

Evergreen 3.1.1 has the following changes improving on Evergreen 3.1.0:

  • Fixes an issue that prevented patron alerts from showing to staff at other libraries.
  • Corrects the “Holdable” attribute display on the Item Status detailed view.
  • Fixes the ability to delete multiple copies from Item Status.

Evergreen 3.0.7 has the following changes improving on Evergreen 3.0.6:

  • Fixes a performance issue with the Patron Billing History screen and other screens that caused joins to be re-created unnecessarily.
  • Fixes an issue that prevented patron alerts from showing to staff at other libraries.
  • Corrects the “Holdable” attribute display on the Item Status detailed view.
  • Fixes the ability to delete multiple copies from Item Status.

Please visit the Evergreen downloads page to download the upgraded software and to read full release notes. Many thanks to everyone who contributed to the releases! In particular, we would like to acknowledge the buildmasters for these releases, Dan Wells and Chris Sharp, and the writer of the release notes, Jane Sandberg.

District Dispatch: School library literacy grants available

Thu, 19 Apr 2018 17:00:11 +0000

This week, the U.S. Department of Education announced the beginning of the application period for Innovative Approaches to Literacy (IAL) grants. This program, which is open to school libraries, provides $27 million in federal funding to support the improvement of literacy skills for youth in high-need schools and communities.

Applicants have until May 18 to submit a grant proposal to the Department. Of note this year, the Department has placed a priority on funding proposals that address specific priorities. Grant applications that demonstrate a clear rationale for the proposed activities, promote the use of STEM as part of the program, and/or implement programs in rural communities or schools could be awarded additional points during consideration. Full details regarding the application process and submission requirements are available through the link above.

At least half of all IAL grants are reserved for the development and improvement of an effective school library program. Grants can be used for book distribution, professional literacy development for librarians, and the purchase of up-to-date library materials. The Department of Education notes that IAL funds are intended to support high-quality programs designed to develop and improve literacy skills for children and students from birth through 12th grade in high-need local educational agencies and schools. National non-profit organizations are also eligible to apply for IAL grants and may develop and/or implement programs while partnering with school libraries. The Department of Education estimates it will issue grants to as many as 30 schools and six non-profits.

Recently, ALA members advocated for IAL funding during the annual “Dear Appropriator” campaign. Letters in the House (signed by 93 Representatives) and the Senate (signed by 35 Senators) called for continued funding for the only program with dedicated funding for school libraries. The FY 2018 Omnibus package recently passed by Congress included level funding for IAL at $27 million.

The post School library literacy grants available appeared first on District Dispatch.

District Dispatch: ALA spotlights broadband in Tribal and rural libraries for National Library Week

Thu, 19 Apr 2018 13:40:37 +0000

The American Library Association’s (ALA) Washington Office hosted senior policymakers, librarians and telecommunications experts from across the nation for a National Library Week luncheon panel to discuss broadband in Tribal and rural libraries on Thursday, April 12, in the U.S. Capitol Building. The panel, moderated by National Museum of the American Indian Librarian Elayne Silversmith, focused on how broadband connectivity and telecommunications infrastructure in Tribal and rural regions advances education, provides economic opportunity and can close the digital divide. Panelists included Cynthia Aguilar, Librarian, Santo Domingo Pueblo, New Mexico; Hannah Buckland, ALA Policy Corps member and Director of Library Services, Leech Lake Tribal College (Minn.); Irene Flannery, Director of AMERIND Critical Infrastructure; and Kelly Wismer, Public Relations Manager at NTCA – The Rural Broadband Association. Tribal librarian Cynthia Aguilar explained that establishing adequate broadband infrastructure will be as life-changing for her community as the introduction of the railroad. Aguilar was quoted in a Washington Post article: “Once the fiber optics are lit, it will be black and white. It will be so spectacular.” The panel discussion was bookended by keynote speeches from Senator Martin Heinrich (D-NM) and Federal Communications Commission (FCC) Commissioner Mignon Clyburn, with ALA President Jim Neal emceeing the event. Senator Heinrich’s opening remarks highlighted the Tribal Connect Act of 2017 (S.2205), a bipartisan bill he introduced with U.S. Senator Dean Heller (R-NV) to improve broadband infrastructure and connectivity in Native American communities. “I’m pleased to partner with the American Library Association to convene this important discussion on closing the digital divide in Indian Country and continue building the momentum for the Tribal Connect Act,” said Senator Heinrich.
“The Tribal Connect Act is an investment in broadband infrastructure and high-speed internet access in Indian Country so all of our students and children can compete on an even playing field and learn the skills they need to succeed in the 21st century. Connecting more Tribes to the E-rate program will strengthen broadband across rural New Mexico and improve education, boost the economy and increase public safety and civic engagement.” ALA President Jim Neal emphasized the Association’s support of the bill and commitment to its members in tribal and rural communities. “For many people in Tribal and rural areas, the lack of high-speed internet access means that competing in today’s economy is a steep climb and becoming steeper. Improving access to the E-rate program is a strong start toward improving high-speed internet access to the least connected people in America. The American Library Association wholeheartedly supports the Tribal Connect Act and looks forward to advocating for its passage.” The Tribal Connect Act is supported by ALA; the National Congress of American Indians; National Indian Education Association; AMERIND Risk; and the Association of Tribal Archives, Libraries and Museums. Commissioner Clyburn delivered closing remarks. “As a longtime champion for the FCC’s E-rate program and a daughter of a retired librarian, I believe [...]

District Dispatch: Senate Foreign Relations Committee hearing on the Marrakesh Treaty

Thu, 19 Apr 2018 13:04:44 +0000

The Senate Foreign Relations Committee held a hearing yesterday on the Marrakesh Treaty Implementation Act (S. 2559). If passed, the legislation would make available an additional 350,000 accessible books for people with print disabilities living in the United States, according to Manisha Singh, Assistant Secretary, Bureau of Economic and Business Affairs at the U.S. State Department. In her testimony, Singh noted the extensive preparatory work that went into crafting the Marrakesh Treaty, an international copyright exception that would allow authorized entities (including libraries) to make accessible copies of works and distribute them across international borders. The Marrakesh Treaty was signed by member countries of the World Intellectual Property Organization (WIPO) in June 2013 at an international negotiation conference held in Marrakesh. Scott LaBarre, counsel for the National Federation of the Blind (NFB), recounted that, at the time, there were 37 issues with the draft treaty that had not yet been worked out among the stakeholders, including publishers, librarians, and the beneficiaries of the treaty: people with print disabilities around the world. When nearly all hope was lost—because there was deep opposition to the treaty—the King of Morocco showed up and said that he would close the airports if the parties had not reached consensus. The subtle threat worked. Since that time, sustained commitment from the stakeholders finally led to the introduction of the Marrakesh Treaty Implementation Act. Jonathan Band spoke on behalf of the Library Copyright Alliance (LCA), a coalition consisting of ALA, the Association of Research Libraries (ARL) and the Association of College and Research Libraries (ACRL).
In his testimony, Band explained: “The Marrakesh Treaty creates a system that allows the cross-border exchange of accessible format copies between countries that have joined the Treaty… With digital formats such as renewable braille or audio books, Americans with print disabilities would be able to access foreign books within minutes of requesting them.” In his opening statement, Senator Cardin (D-Maryland) applauded the efforts of the National Federation of the Blind (NFB), which is headquartered in Baltimore. He called their work a “constructive force promoting the quality of life.” Cardin noted the human rights importance of the treaty, praising the United States as a model for the rest of the world. He noted that the impact on the United States was obvious, but that by ratifying the treaty we would motivate other countries to ratify. Senator Kaine (D-Virginia) read a quote from Susan Paddock, a librarian at the Bayside and Special Services Library, Department of Public Libraries in Virginia Beach, who said that her library users were voracious readers who often ran out of books to read because of the limited number available. Paddock asked, “Can you imagine running out of books to read?” Access to reading materials is something most people take for granted, so it was a given that the Library Copyright Alliance would support the Marrakesh Treaty, as it has for more than 10 years, all told. Everything was so positive and upbeat that Senator Corker (R-TN), chair of the committee, said there was no need “to grill” the pa[...]

DuraSpace News: Call for participation: DSpace Anwendertreffen 2018

Thu, 19 Apr 2018 00:00:00 +0000

From Pascal-Nicolas Becker, The Library Code GmbH: The DSpace User Meeting 2018 (DSpace Anwendertreffen) will take place at the University of Berlin on Thursday, 13 September 2018.

DuraSpace News: Attend Upcoming CASRAI Community Events in Canada, UK and Europe

Thu, 19 Apr 2018 00:00:00 +0000

CASRAI has a number of annual events taking place in our various chapters in the coming months. We hope you will see an event near you with a programme that catches your interest. Reconnect events are an opportunity for you to join the CASRAI community in your country as we work together to make research management information more efficient and effective throughout an ever changing research management lifecycle. You don’t have to be a member to attend and you can attend any of the events listed below based on your location and interests.

District Dispatch: Library makerspaces come to Capitol Hill with Digital Fab!

Wed, 18 Apr 2018 23:32:19 +0000

[Image: Rep. Ben Ray Luján (D-N.Mex.) presenting remarks.]

On Wednesday, April 11, the American Library Association’s (ALA) Washington Office along with Rep. Ben Ray Luján (D-N.Mex.), hosted “Digital Fab! Libraries Advance Entrepreneurship and Innovation” in honor of National Library Week. The event, which took place on Capitol Hill, showcased library makerspaces and the important technology and services they provide to patrons.

With help from the DC Public Library’s (DCPL) Fabrication Lab (Fab Lab) staff, guests were able to get hands-on experience with 3D printers and other computer-enabled technologies. In addition, more than 50 public and academic libraries from across the nation submitted images of their makerspaces for display at the event to highlight the growing presence of library maker centers throughout the country.

Attended by congressional staff and supporters of community maker centers, the event included a brief program with remarks by ALA President Jim Neal, Rep. Ben Ray Luján, and DCPL Executive Director Richard Reyes-Gavilan. The purpose of the event was to demonstrate the role of libraries in advancing STEM education and the maker economy and in providing public access to tools that are revolutionizing the manufacturing industry.

[Image: DCPL Adult Librarian Esti Brennan with Archivist of the United States David S. Ferriero.]

As Jim Neal said in his remarks, “Today’s libraries not only fuel the imagination, they propel innovation and entrepreneurship through new technology.”

We hope that the success of Digital Fab! will further the discussions with members of Congress and their staff about libraries as anchor institutions in their communities.

Do you have a makerspace in your library? Invite your representative to tour your facility. Need help arranging a meeting? Call the ALA Washington Office at (202) 628-8410 – we are here to help!

The post Library makerspaces come to Capitol Hill with Digital Fab! appeared first on District Dispatch.

LITA: Jobs in Information Technology: April 18, 2018

Wed, 18 Apr 2018 19:15:32 +0000

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Winona State University, Electronic Resources Librarian, Winona, MN

Santa Barbara City College, Librarian – (Web Services and eResources), Santa Barbara, CA

California Digital Library, Metadata Product Manager (career position), Oakland, CA

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Access Conference: Peer reviewers needed!

Wed, 18 Apr 2018 18:23:28 +0000

Access 2018 is seeking peer reviewers to help select presentations for the conference. Peer reviewers will be responsible for reading Access Conference session proposals and reviewing them for selection.

Peer reviewers will be selected by the Program Committee and will ideally include a variety of individuals across disciplines including Access old-timers and newbies alike. You do not need to be attending the conference to volunteer as a peer reviewer.

The peer review process is double-blind for the presentations only. Those who have submitted Access Conference session proposals are welcome and encouraged to become peer reviewers as well. You won’t receive your own proposal to review. Please note that the peer reviewing activity will take place between June 7 and June 21.

To be considered as a peer reviewer, please attach your abridged CV (max. 5 pages) and provide the following information in the body of an email:

  • Name
  • Position and affiliation
  • A few sentences that explain why you want to be a peer reviewer for Access 2018

Please submit this information to by Friday, June 1, 2018.


Open Knowledge Foundation: Our Open Data Day 2018 @

Wed, 18 Apr 2018 13:30:26 +0000

This blog has been reposted from Medium. This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The event in this blog was supported through the mini-grants scheme under the Equal Development theme. The Open Data Day 2018 hackathon at our place near Barbarossa Platz began at 9am and went all day until 8pm. The participants (or hackathletes, as we like to call them) came from all parts of Cologne and even from other cities throughout the state of Nordrhein-Westfalen. The topic of our hackathon was, as intended, air pollution and nitrogen dioxide pollution in particular, which can’t be talked about enough, since the pollution is invisible but its impact on our health is not. And especially in the inner city of Cologne there is lots and lots of air pollution. We had breakfast together and got to know each other over a coffee or two, so we had time to appreciate our similarities and differences before we started working. We had about thirty participants pitching around ten ideas, which ultimately formed themselves into three groups, each with its own goal based on the best pitches made. The three projects realized during our hackathon for Open Data Day 2018 were, as already stated, all about nitrogen dioxide pollution in Cologne, and all of the hacks worked with (or compared) the figures from the two main sources that monitor air pollution in our area.
The first is the City of Cologne; the second is Open Air Cologne, a joint venture by OKlab Cologne, Everykey, the Cologne University of Applied Sciences and, again, the City of Cologne. Our hackers went on to build Python scripts, parsers and APIs to transmit data, transform data, compare the measurements between the two data sources, visualize them, and make the data machine-readable for other users and visualizations. Concerning the accomplishment of goals, we are happy to announce that one project was completely finished and the two runners-up were almost finished and in working condition. The goal of connecting people and keeping them connected was also accomplished, since some of the participants are still in email contact about their projects. Our community did a great deal to further the cause. We had a principal direction in which we wanted to go, but the projects/hacks were all planned and shaped by the participants themselves. The two-hour Barcamp that was held also helped a lot in giving the pitches shape and furthering the scope of each project. Still, through feedback we got the insight that it might have been even more productive to be a little bit more strict in g[...]
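The kind of cross-source comparison the hackathon groups built can be sketched in a few lines of Python. This is only an illustration, not the actual event code: the station names and µg/m³ values are invented, and real data would be fetched from the City of Cologne and Open Air Cologne feeds rather than hard-coded.

```python
# Illustrative sketch: align NO2 readings (µg/m³) from two monitoring
# sources by station and compute per-station differences. All station
# names and values below are invented for the example.

def compare_no2(city_readings, openair_readings):
    """For stations present in both sources, return
    {station: (city_value, openair_value, openair - city)}."""
    common = city_readings.keys() & openair_readings.keys()
    return {
        station: (
            city_readings[station],
            openair_readings[station],
            round(openair_readings[station] - city_readings[station], 1),
        )
        for station in sorted(common)
    }

city = {"Clevischer Ring": 62.0, "Turiner Strasse": 45.0}
openair = {"Clevischer Ring": 58.5, "Neumarkt": 51.0}

# Only "Clevischer Ring" appears in both sources.
print(compare_no2(city, openair))
# → {'Clevischer Ring': (62.0, 58.5, -3.5)}
```

In practice each source publishes its own station identifiers and units, so the real scripts also had to normalize names and timestamps before any comparison or visualization was possible.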

In the Library, With the Lead Pipe: Critical Optimism: Reimagining Rural Communities through Libraries

Wed, 18 Apr 2018 12:42:20 +0000

In Brief: In the absence of governmental agencies and philanthropic support, many rural communities see their local library as the last civic, cultural, or service organization in town. This reality presents obvious challenges to the librarian, and also incredible opportunity. As the primary convener, libraries have the ability to facilitate regeneration in the communities they serve. This article situates rural librarianship within an organizing framework for change and discusses applications of community engagement tools and measures of impact aligned with social wellbeing. By Margo Gustina Introduction Rural libraries have the professional obligation, opportunity, and ability to facilitate positive transformational change in the communities they serve. Rural libraries in the United States serve communities in decline. After years of population exodus, the remaining demographic is overwhelmingly made up of immobilized individuals and families. As agencies have made service decisions based on population densities, an ever increasing number of communities have been abandoned by human service and civic institutions. In this environment, rural libraries fill the ever widening gap between resources and needs. The rural librarian, to catalyze change and facilitate the realization of community aspirations, will become an organizer who will measure progress in terms of equitable social wellbeing, rather than ROI and circulation statistics. Reimagining Communities In the gap between resources and needs, rural libraries can facilitate the re-imagining of their communities. Utilizing organizing frameworks with community engagement tactics, social wellbeing measurements, and regenerative design principles, libraries surviving in service deserts have unprecedented opportunities to realize deep, impactful, just, and long lasting changes in their communities. 
To be organizers of community action, library people must acknowledge the intersection of privilege and domination that exists along ability, citizenship status, class, culture, gender, language, race, and sexuality lines. The full breadth of community diversity has to be appreciated, and members must hold agency in any change and decision making process. It is only through full inclusion that the full potential of systemic change can be realized. That necessitates the purposeful and active deconstruction of current barriers to access and agency for community members marginalized by current systems of domination. Within the interrelated models presented here, libraries can organize and work with their members—the people whom they are to serve. In order for the organization itself to be inclusive and whole, the library organization members need to acknowledge and maintain awareness of the systems of domination they are working against in building equity and justice. For true community potential to be realized, community members themselves must be the designers of their future. Existing Rural Conditions To fully appreciate how the models discussed here apply t[...]

Open Knowledge Foundation: Women Economic and Leadership Transformation Initiative Open Data Event 2018

Wed, 18 Apr 2018 10:27:28 +0000

This blog has been reposted from debwritesblog This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The event in this blog was supported through the mini-grants scheme under the Equal Development theme. We are living in times where it seems very obvious to want certain situations. One of them is the presence of women in all professional fields. Who would not agree that such representation should be fair and equal with respect to the opposite gender? Perhaps nobody would oppose it in public, but the reality is different. Women are not balanced in all professional environments, and more and more cases are reported that reflect the way they are rewarded for their work is not fair. When it comes to open data it is a different situation. It does not take gender into consideration: instead it serves as an empowerment tool for any individual who is interested in making use of it. Open data Open data – data anyone can access, use or share – is transformative infrastructure for a digital economy that is consistently innovating and bringing the benefits of the Web to society. It often goes hand in hand with open working cultures and open business practices. While this culture lends itself to diversity, it is important that those who are involved in open data make sure it addresses everyone’s needs. It is therefore encouraging to see that open data initiatives in African countries are being led by women. From heading up technical teams to leading stakeholder engagement strategies, these leaders are driving open data across the continent. 
The Women Economic and Leadership Transformation Initiative (WELTI), in partnership with The Hewlett Foundation, Open Knowledge International and SPARC, organized a day event to celebrate Open Data Day 2018 on 3 March 2018 at the Fountain Heights Secondary School, Surulere, where the speakers spoke on “Understanding gender inequality through open data/knowledge” and “The role of data and business in a woman’s world” respectively to 70 young women, young men and some teachers. Key message shared: One of the female speakers noted that the proportion of women using the internet is 12% and that the percentage of women who have access to the internet is 50% lower than that of men. In her opinion, advocacy on gender inequality pertaining to the usage of data can be achieved through: 1. Proper orientation. People need to be enlightened on the use of data and its far-reaching impact on society. 2. More e-learning centers, so that more women can gain access to the internet, especially in rural areas. It was also established that data can go a long way in [...]

Fiona Bradley: Careers worth doing, for people looking for new work

Wed, 18 Apr 2018 08:07:37 +0000

The careers worth doing discussed here are freelance careers, because these days a regular job alone may not be enough. Many freelance occupations remain popular and can keep earning money steadily, while others top the charts for a moment and then fade with time. Let’s take a look at the careers that are still going strong and continue to make money.

1.) Selling goods online

This career is hugely popular these days. Even after many years the number of online sellers has not declined; if anything, it keeps growing. In this IT era, the convenience of shopping is something many people love: one click and the goods are delivered to your home, with no tiring trip to the market. That said, what you sell has to be popular too. One strong seller is clothing, especially the chic, cool styles teenagers like, which never go out of fashion.


2.) Freelance photographer

In an era when social media plays a part in nearly 100% of our daily lives, posting photos to tell the story of what we encounter each day has become routine. That has made freelance photography popular again whenever photos are wanted for important days: graduation ceremonies, photographed from the minor rehearsal through the dress rehearsal, with extra sessions besides; and weddings, from pre-wedding shoots at various locations to photos on the day itself, all to get impressive pictures that can be posted proudly to Facebook or Instagram. Still photography for magazines is also in demand, and selling stock photos can bring in considerable money as well.



3.) Freelance makeup artist

This career goes hand in hand with freelance photography; wherever there is a makeup artist, there is a photographer. Makeup artists get far more business than in the past, because as the world changes, tastes change with it. Ever since Facebook and Instagram came along, no one wants to appear in the media bare-faced, so for bridesmaids, brides, pre-wedding shoots, graduations (dress rehearsals and minor rehearsals alike), class reunions, festivities, and even birthday parties, makeup artists can hardly keep up with the work, barely getting any sleep at all. And all of this is only part of the job; in truth there are plenty more sources of work for a freelance makeup artist.
If you are looking for new work, or for a new career as an option for yourself, it would not be a bad idea to look at your personal interests and then learn about them in earnest.



Peter Murray: Privacy in the Context of Content Platforms and Discovery Tools

Wed, 18 Apr 2018 04:00:00 +0000

These are the presentation notes for the Privacy in the Context of Content Platforms and Discovery Tools presentation during the NISO Information Freedom, Ethics, and Integrity virtual conference on Wednesday, April 18, 2018. The full text of the talk and the slides will be posted later today.

  • ALA Code of Ethics
  • Privacy: An Interpretation of the Library Bill of Rights
  • Panopticlick - Electronic Frontier Foundation
  • Cross-Site Script Inclusion: A Fameless but Widespread Web Vulnerability Class
  • A Face Is Exposed for AOL Searcher No. 4417749 - The New York Times
  • Data Sets Not So Anonymous - MIT Technology Review
  • De-identification and Patron Data - Intellectual Freedom Blog
  • Complying With New Privacy Laws and Offering New Privacy Protections to Everyone, No Matter Where You Live - Facebook Newsroom
  • People, your biggest asset and greatest risk to GDPR compliance
  • Publishers Haven’t Realized Just How Big a Deal GDPR is - Baekdal Plus [...]

DuraSpace News: DuraCloud Selected as a Featured Open Source Project for Mozilla Global Sprint

Wed, 18 Apr 2018 00:00:00 +0000

DuraSpace is excited to announce that its project, Open Sourcing DuraCloud: Beyond the License, has been selected as a featured project for the Mozilla Foundation Global Sprint, to be held May 10th-11th, 2018.

HangingTogether: What’s changed in linked data implementations in the last three years?

Tue, 17 Apr 2018 20:42:41 +0000

Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak, used under a CC-BY-SA license.

OCLC Research conducted an “International Linked Data Survey for Implementers” in 2014 and 2015, attracting responses from a total of 90 institutions in 20 countries. In the 2015 survey, 168 linked data projects or services were reported, of which 112 were described; 61% of them had been in production for over two years. This represented a doubling of the number of relatively “mature” linked data implementations compared to the 2014 results.

We are curious: what might have changed in the last three years? OCLC Research has decided to repeat its survey to learn the details of new projects or services, launched since the last survey, that format metadata as linked data and/or make subsequent uses of it. We are also interested in what might have changed in the linked data implementations or plans reported in the previous surveys. The questions are mostly the same so we can more easily compare results.

The target audience is staff who have implemented or are implementing linked data projects or services, either by publishing data as linked data, by consuming linked data resources into their own data or applications, or both. So if you have implemented or are implementing a linked data project or service, please take the 2018 survey! The link: We are asking that responses be completed by 25 May 2018.

As with the previous surveys, we will share the examples collected for the benefit of others wanting to undertake similar efforts, and add the responses to those from the 2014 and 2015 surveys (without contact information) available in this Excel workbook. What do you think has changed in the last three years? [...]
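For readers new to linked data, the survey’s distinction between “publishing” and “consuming” linked data can be made concrete with a small sketch. The JSON-LD record, vocabulary mappings, and URIs below are invented for illustration; a real implementation would use published vocabularies and dereferenceable URIs, and would typically parse the data with an RDF library rather than plain JSON.

```python
import json

# An invented JSON-LD record describing a book. The @context maps the
# short keys used in the record to full vocabulary URIs, which is what
# lets the same data be interpreted as RDF triples.
record = """
{
  "@context": {
    "title": "http://schema.org/name",
    "author": "http://schema.org/author"
  },
  "@id": "http://example.org/book/1",
  "title": "An Example Title",
  "author": {"@id": "http://example.org/person/1"}
}
"""

# "Consuming" linked data, at its simplest: parse the record and follow
# the URIs that identify this resource and link it to others.
data = json.loads(record)
subject = data["@id"]               # URI identifying this resource
title = data["title"]               # a literal value
author_uri = data["author"]["@id"]  # a link to another resource

print(subject, title, author_uri)
```

Publishing is the mirror image: exposing records like this one at stable URIs so that other applications can follow the links back.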

Code4Lib: Code4Lib Journal Issue 41 Call for Papers

Tue, 17 Apr 2018 14:05:18 +0000

Call for Papers (and apologies for cross-posting):

The Code4Lib Journal (C4LJ) exists to foster community and share information among those interested in the intersection of libraries, technology, and the future.

We are now accepting proposals for publication in our 41st issue. Don't miss out on this opportunity to share your ideas and experiences. To be included in the 41st issue, which is scheduled for publication in August 2018, please submit articles, abstracts, or proposals at or to by Friday, May 11, 2018. When submitting, please include the title or subject of the proposal in the subject line of the email message.

C4LJ encourages creativity and flexibility, and the editors welcome submissions across a broad variety of topics that support the mission of the journal. Possible topics include, but are not limited to:

* Practical applications of library technology (both actual and hypothetical)
* Technology projects (failed, successful, or proposed), including how they were done and challenges faced
* Case studies
* Best practices
* Reviews
* Comparisons of third party software or libraries
* Analyses of library metadata for use with technology
* Project management and communication within the library environment
* Assessment and user studies

C4LJ strives to promote professional communication by minimizing the barriers to publication. While articles should be of a high quality, they need not follow any formal structure. Writers should aim for the middle ground between blog posts and articles in traditional refereed journals. Where appropriate, we encourage authors to submit code samples, algorithms, and pseudo-code. For more information, visit C4LJ's Article Guidelines or browse articles from the first 40 issues published on our website:

Remember, for consideration for the 41st issue, please send proposals, abstracts, or draft articles to no later than Friday, May 11, 2018.

Send in a submission. Your peers would like to hear what you are doing.

Code4Lib Journal Editorial Committee

District Dispatch: Ready to Code school library brings music, coding to students with learning disabilities

Tue, 17 Apr 2018 12:54:45 +0000

In this 2.5 minute video, see how Heritage High School (Newport News, Va.) librarian Melanie Toran and the students she works with are combining music and coding to gain computational thinking literacies.

This post is the second in a series by Libraries Ready to Code cohort participants, who will release their beta toolkit at ALA’s 2018 Annual Conference.


The post Ready to Code school library brings music, coding to students with learning disabilities appeared first on District Dispatch.

Open Knowledge Foundation: Apply Now! School of Data’s 2018 Fellowship Programme

Tue, 17 Apr 2018 10:05:27 +0000

This blog has been reposted from the School of Data blog.

School of Data is inviting journalists, data scientists, civil society advocates and anyone interested in advancing data literacy to apply for its 2018 Fellowship Programme, which will run from May 2018 to January 2019. Eight positions are open, one in each of the following countries: Bolivia, Guatemala, Ghana, Indonesia, Kenya, Malawi, Tanzania, The Philippines. The application deadline is Sunday, May 6, 2018. If you would like to sponsor a fellowship, please get in touch with School of Data.

Apply for the Fellowship Programme

The Fellowship

School of Data works to empower civil society organisations, journalists and citizens with the skills they need to use data effectively in their efforts to create more equitable and effective societies. Fellowships are nine-month placements with School of Data for data-literacy practitioners or enthusiasts. During this time, Fellows work alongside School of Data to build an individual programme that will make use of both the collective experience of School of Data’s network, to help Fellows gain new skills, and the knowledge that Fellows bring along with them, be it about a topic, a community or specific data literacy challenges.

As in previous years, our aim with the Fellowship programme is to increase awareness of data literacy and build communities who, together, can use data literacy skills to make the change they want to see in the world. The 2018 Fellowship will continue the thematic approach pioneered by the 2016 class. As a result, we will be prioritising candidates who:

* possess experience in, and enthusiasm for, a specific area of data literacy training
* can demonstrate links with an organisation practising in this defined area and/or links with an established network operating in the field

We are looking for engaged individuals who already have in-depth knowledge of a given sector or specific skillsets that can be applied to this year’s focus topics. This will help Fellows get off to a running start and achieve the most during their time with School of Data: nine months fly by!

Read More about the Fellowship Programme

The areas of focus in 2018

We have partnered with Hivos and NRGI to work on the following themes: procurement, and data in the extractives industry (oil, mining, gas). These amazing partner organisations will provide Fellows with guidance, mentorship and expertise in their respective domains.

2018 Fellowship Positions

Bolivia

The Fellowship in Bolivia will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with and interest in community building, experience with the implementation of civic projects with a data or technical component, story[...]

DuraSpace News: VIVO Updates April 8 -- camp, conference, new sites, action planning

Tue, 17 Apr 2018 00:00:00 +0000

From Mike Conlon, VIVO Product Director

District Dispatch: 2018 WHCLIST award winner announced

Mon, 16 Apr 2018 16:13:52 +0000

This week, the American Library Association’s (ALA) Washington Office announced that Yolanda Peña-Mendrek of Oakley, California is the winner of the 2018 White House Conference on Library and Information Services (WHCLIST) Award. Given to a non-librarian participant attending National Library Legislative Day, the award covers hotel fees and includes a $300 stipend to defray the cost of attending the event.

An active library advocate and a member of the Friends of the Oakley Library, Peña-Mendrek was appointed Contra Costa County Library Commissioner in 2017. In her first year as Library Commissioner, she helped raise funds to support five branch libraries serving fast-growing parts of the county. A retired teacher, Peña-Mendrek is a firm believer in the importance of a good education and access to information. As a teacher, she saw librarians working firsthand with the students at her school, as well as through the local library. She sees libraries as a place where people from all walks of life have the opportunity to expand their knowledge, and strongly believes that elected officials need to hear about the services libraries provide for their communities. Upon learning that she would be the recipient of the 2018 WHCLIST award, she had this to say: “I feel humbled and extremely honored to receive this scholarship to be able to represent my community, and to be their voice on this National Library Legislative Day 2018.”

Beyond her involvement with the local libraries, Peña-Mendrek has also served a number of other organizations in her community, including the National Association of Latinos Elected and Appointed Officials, the California School Board Association, and the American Council of Teachers of Foreign Languages. Now that she is retired, Peña-Mendrek wants to put her energy to use by continuing to support her community’s access to libraries. We look forward to having her attend National Library Legislative Day 2018, where she will join other attendees from California to advocate on behalf of libraries.

The White House Conference on Library and Information Services, an effective force for library advocacy nationally, statewide and locally, transferred its assets to the ALA Washington Office in 1991 after the last White House conference. These funds allow ALA to participate in fostering a spirit of committed, passionate library support in a new generation of library advocates. Leading up to National Library Legislative Day each year, the ALA seeks nominations for the award; representatives of WHCLIST choose the recipient.

The post 2018 WHCLIST award winner announced appeared first on District Dispatch.[...]