Tue, 21 Jun 2016 08:00:00 EDTAn error concealment method and apparatus for an audio signal and a decoding method and apparatus for an audio signal using the error concealment method and apparatus. The error concealment method includes selecting one of an error concealment in a frequency domain and an error concealment in a time domain as an error concealment scheme for a current frame based on a predetermined criterion when an error occurs in the current frame, selecting one of a repetition scheme and an interpolation scheme in the frequency domain as the error concealment scheme for the current frame based on a predetermined criterion when the error concealment in the frequency domain is selected, and concealing the error of the current frame using the selected scheme.
Tue, 29 Mar 2016 08:00:00 EDTA time warp contour calculator for use in an audio signal decoder receives an encoded warp ratio information, derives a sequence of warp ratio values from the encoded warp ratio information, and obtains warp contour node values starting from a time warp contour start value. Ratios between the time warp contour node values and the time warp contour starting value are determined by the warp ratio values. The time warp contour calculator computes a time warp contour node value of a given time warp contour node, on the basis of a product-formation having a ratio between the time warp contour node values of the intermediate time warp contour node and the time warp contour starting value and a ratio between the time warp contour node values of the given time warp contour node and of the intermediate time warp contour node as factors.
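The product formation in this abstract can be illustrated with a minimal Python sketch. It assumes that each decoded warp ratio value is the ratio between one contour node and the node before it; the function name `warp_contour_nodes` and the start value of 1.0 are hypothetical and are not taken from the patent.

```python
from itertools import accumulate
from operator import mul


def warp_contour_nodes(warp_ratios, start_value=1.0):
    """Return contour node values, beginning with the start value.

    Assumption: warp_ratios[i] is the ratio between node i+1 and node i,
    so each node is the start value times the running product of ratios.
    """
    return list(accumulate(warp_ratios, mul, initial=start_value))


nodes = warp_contour_nodes([1.25, 0.8, 2.0])
# The ratio of nodes[2] to the start value is the product of two factors:
# (nodes[1] / nodes[0]) and (nodes[2] / nodes[1]).
```

Under this assumption, the ratio between any node and the start value is the product of the ratio for an intermediate node and the ratio between the given node and that intermediate node, which is the relation the abstract describes.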
Tue, 21 Jul 2015 08:00:00 EDTDisclosed herein are systems and methods for navigating electronic texts. According to an aspect, a method may include determining text subgroups within an electronic text. The method may also include selecting a text seed within one of the text subgroups. Further, the method may include determining a similarity relationship between the text seed and one or more adjacent text subgroups that do not include the selected text seed. The method may also include associating the text seed with the one or more adjacent text subgroups based on the similarity relationship to create a text cluster.
Tue, 02 Jun 2015 08:00:00 EDTAn apparatus for encoding an audio signal having a stream of audio samples has: a windower for applying a prediction coding analysis window to the stream of audio samples to obtain windowed data for a prediction analysis and for applying a transform coding analysis window to the stream of audio samples to obtain windowed data for a transform analysis, wherein the transform coding analysis window is associated with audio samples within a current frame of audio samples and with audio samples of a predefined portion of a future frame of audio samples being a transform-coding look-ahead portion, wherein the prediction coding analysis window is associated with at least the portion of the audio samples of the current frame and with audio samples of a predefined portion of the future frame being a prediction coding look-ahead portion, wherein the transform coding look-ahead portion and the prediction coding look-ahead portion are identical to each other or are different from each other by less than 20%; and an encoding processor for generating prediction coded data or for generating transform coded data.
Tue, 26 May 2015 08:00:00 EDTA multilingual electronic transfer dictionary provides for automatic topic disambiguation by including one or more topic codes in definitions contained in the dictionary. Automatic topic disambiguation is accomplished by determining the frequencies of topic codes within a block of text. Dictionary entries having more frequently occurring topic codes are preferentially selected over those having less frequently occurring topic codes. When the topic codes are members of a hierarchical topical coding system, such as the International Patent Classification system, an iterative method can be used which starts with a coarser level of the coding system and is repeated at finer levels until an ambiguity is resolved. The dictionary is advantageously used for machine translation, e.g. between Japanese and English.
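The frequency-based selection in this abstract can be sketched in a few lines of Python. The data layout (pairs of translation and topic code), the function name `pick_translation`, and the sample entries are all hypothetical; the sketch covers a single level of the coding system and leaves out the iterative coarse-to-fine refinement.

```python
from collections import Counter


def pick_translation(candidates, context_topic_codes):
    """Pick the candidate whose topic code occurs most often in the context.

    candidates: list of (translation, topic_code) pairs for one ambiguous word.
    context_topic_codes: topic codes found for other words in the text block.
    """
    frequencies = Counter(context_topic_codes)
    # max() keeps the first candidate on ties, so list order acts as a default.
    return max(candidates, key=lambda entry: frequencies[entry[1]])[0]


# Hypothetical entries for one ambiguous word; the codes imitate IPC classes.
candidates = [("bank (finance)", "G06Q"), ("bank (river)", "E02B")]
context = ["E02B", "E02B", "G06Q", "E02B"]
chosen = pick_translation(candidates, context)
```

Here `chosen` is the river sense, because its topic code occurs three times in the context against one occurrence for the finance code.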
Tue, 26 May 2015 08:00:00 EDTA platform application and methods of operation that integrate both native and third-party modules into an integrated environment on an inmate computing device are disclosed. Third-party modules or systems are applications meant to operate independent from the platform application. Information is communicated between the platform application and third-party module or system to add audit, alarm and other functions across all modules or systems controlled by the platform software. The third-party module or system is audited to allow triggering of rules that cause remedial action to be taken. Triggers can be on actions not monitored by a particular third-party module or system.
Tue, 26 May 2015 08:00:00 EDTIn one example, a method includes receiving an indication of an input gesture detected at a presence-sensitive input device, where the input gesture includes one or more input points and each input point is detected at a respective location of the presence-sensitive input device. The method may also include determining a focal point of the input gesture, and determining a radius length. The method may also include determining a shape centered at the focal point and having a size determined based on the radius length. The method may also include responding to a change in a geometric property of the shape by scaling information included in a graphical user interface, where the scaling of the information is centered at the focal point.
Tue, 26 May 2015 08:00:00 EDTVarious embodiments enable a device to perform tasks such as processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text. In at least one embodiment, processing the image includes substantially simultaneously or concurrently processing the image with at least two recognition engines, such as at least two optical character recognition (OCR) engines, running in a multithreaded mode. In at least one embodiment, the recognition engines can be tuned so that their respective processing speeds are roughly the same. Utilizing multiple recognition engines enables processing latency to be close to that of using only one recognition engine.
Tue, 26 May 2015 08:00:00 EDTAn audio signal decoder has a time warp contour calculator, a time warp contour data rescaler and a warp decoder. The time warp contour calculator is configured to generate time warp contour data repeatedly restarting from a predetermined time warp contour start value, based on time warp contour evolution information describing a temporal evolution of the time warp contour. The time warp contour data rescaler is configured to rescale at least a portion of the time warp contour data such that a discontinuity at a restart is avoided, reduced or eliminated in a rescaled version of the time warp contour. The warp decoder is configured to provide the decoded audio signal representation, based on an encoded audio signal representation and using the rescaled version of the time warp contour.
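The rescaling step in this abstract can be pictured with a small Python sketch. It assumes that rescaling means multiplying the earlier contour segment by a single factor so that its last value equals the restart value; the abstract does not state the exact rule, and the function name `rescale_before_restart` and the sample values are hypothetical.

```python
def rescale_before_restart(previous_segment, start_value=1.0):
    """Scale a contour segment so its last value equals the restart value."""
    factor = start_value / previous_segment[-1]
    return [value * factor for value in previous_segment]


previous = [1.0, 1.1, 1.2]   # segment that ended at 1.2
current = [1.0, 0.95, 0.9]   # next segment restarts from 1.0
# Drop the duplicated joint sample when concatenating the two segments.
joined = rescale_before_restart(previous) + current[1:]
```

A uniform scale factor keeps the ratios between neighbouring nodes of the earlier segment unchanged while removing the jump from 1.2 back to 1.0 at the restart.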
Tue, 26 May 2015 08:00:00 EDTAn audio encoder for encoding an audio signal has a first coding branch, the first coding branch comprising a first converter for converting a signal from a time domain into a frequency domain. Furthermore, the audio encoder has a second coding branch comprising a second time/frequency converter. Additionally, a signal analyzer for analyzing the audio signal is provided. The signal analyzer, on the one hand, determines whether an audio portion is effective in the encoder output signal as a first encoded signal from the first encoding branch or as a second encoded signal from a second encoding branch. On the other hand, the signal analyzer determines a time/frequency resolution to be applied by the converters when generating the encoded signals. An output interface includes, in addition to the first encoded signal and the second encoded signal, a resolution information identifying the resolution used by the first time/frequency converter and used by the second time/frequency converter.
Tue, 26 May 2015 08:00:00 EDTA method of signal processing according to one embodiment includes calculating an envelope of a first signal that is based on a low-frequency portion of a speech signal, calculating an envelope of a second signal that is based on a high-frequency portion of the speech signal, and calculating a plurality of gain factor values according to a time-varying relation between the envelopes of the first and second signal. The method includes attenuating, based on a variation over time of a relation between the envelopes of the first and second signals, at least one of the plurality of gain factor values. In one example, the variation over time of a relation between the envelopes is indicated by at least one distance among the plurality of gain factor values.
Tue, 26 May 2015 08:00:00 EDTA speech recognition method including the steps of receiving a speech input from a known speaker of a sequence of observations and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model. The acoustic model has a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation and has been trained using first training data and adapted using second training data to said speaker. The speech recognition method also determines the likelihood of a sequence of observations occurring in a given language using a language model and combines the likelihoods determined by the acoustic model and the language model and outputs a sequence of words identified from said speech input signal. The acoustic model is context based for the speaker, the context based information being contained in the model using a plurality of decision trees and the structure of the decision trees is based on second training data.
Tue, 26 May 2015 08:00:00 EDTA messaging response system is disclosed wherein a service providing system provides services to users via messaging communications. In accordance with an exemplary embodiment of the present invention, multiple respondents servicing users through messaging communications may appear to simultaneously use a common “screen name” identifier.
Tue, 26 May 2015 08:00:00 EDTIn a mobile device, a bone conduction or vibration sensor is used to detect the user's speech and the resulting output is used as the source for a low power Voice Trigger (VT) circuit that can activate the Automatic Speech Recognition (ASR) of the host device. This invention is applicable to mobile devices such as wearable computers with head mounted displays, mobile phones and wireless headsets and headphones which use speech recognition for the entering of input commands and control. The speech sensor can be a bone conduction microphone used to detect sound vibrations in the skull, or a vibration sensor, used to detect sound pressure vibrations from the user's speech. This VT circuit can be independent of any audio components of the host device and can therefore be designed to consume ultra-low power. Hence, this VT circuit can be active when the host device is in a sleeping state and can be used to wake the host device on detection of speech from the user. This VT circuit will be resistant to outside noise and react solely to the user's voice.
Tue, 26 May 2015 08:00:00 EDTA biometric voice command and control switching device has a microphone assembly for receiving a currently spoken challenge utterance and a reference utterance, and a voice processing circuit for creating electronic signals indicative thereof. The device further includes a memory for storing the electronic signals, and a processor for comparing the electronic signals to determine if there is a match. If there is a match, an interface circuit enables the operable control of the controlled device.
Tue, 26 May 2015 08:00:00 EDTThis device 301 stores a first content-specific language model representing a probability that a specific word appears in a word sequence representing a first content, and a second content-specific language model representing a probability that the specific word appears in a word sequence representing a second content. Based on a first probability parameter representing a probability that a content represented by a target word sequence included in a speech recognition hypothesis generated by a speech recognition process of recognizing a word sequence corresponding to a speech, a second probability parameter representing a probability that the content represented by the target word sequence is a second content, the first content-specific language model and the second content-specific language model, the device creates a language model representing a probability that the specific word appears in a word sequence corresponding to a part corresponding to the target word sequence of the speech.
Tue, 26 May 2015 08:00:00 EDTA speech recognition system, method of recognizing speech and a computer program product therefor. A client device identified with a context for an associated user selectively streams audio to a provider computer, e.g., a cloud computer. Speech recognition receives streaming audio, maps utterances to specific textual candidates and determines a likelihood of a correct match for each mapped textual candidate. A context model selectively winnows candidates to resolve recognition ambiguity according to context whenever multiple textual candidates are recognized as potential matches for the same mapped utterance. Matches are used to update the context model, which may be used for multiple users in the same context.
Tue, 26 May 2015 08:00:00 EDTThe present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
Tue, 26 May 2015 08:00:00 EDTA system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. Certain embodiments of the system and methods determine at least one exact, inexact, and partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template.
Tue, 26 May 2015 08:00:00 EDTMethods, systems, and apparatus, including computer programs encoded on computer storage media, for speech recognition. One of the methods includes receiving a base language model for speech recognition including a first word sequence having a base probability value; receiving a voice search query associated with a query context; determining that a customized language model is to be used when the query context satisfies one or more criteria associated with the customized language model; obtaining the customized language model, the customized language model including the first word sequence having an adjusted probability value being the base probability value adjusted according to the query context; and converting the voice search query to a text search query based on one or more probabilities, each of the probabilities corresponding to a word sequence in a group of one or more word sequences, the group including the first word sequence having the adjusted probability value.
Tue, 26 May 2015 08:00:00 EDTSome embodiments of the inventive subject matter include a method for detecting speech loss and supplying appropriate recollection data to the user. Such embodiments include detecting a speech stream from a user, converting the speech stream to text, storing the text, detecting an interruption to the speech stream, wherein the interruption to the speech stream indicates speech loss by the user, searching a catalog using the text as a search parameter to find relevant catalog data, and presenting the relevant catalog data to remind the user about the speech stream.
Tue, 26 May 2015 08:00:00 EDTAn encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error. A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.
Tue, 26 May 2015 08:00:00 EDTAn apparatus for decoding data segments representing a time-domain data stream, a data segment being encoded in the time domain or in the frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples. The apparatus includes a time-domain decoder for decoding a data segment being encoded in the time domain and a processor for processing the data segment being encoded in the frequency domain and output data of the time-domain decoder to obtain overlapping time-domain data blocks. The apparatus further includes an overlap/add-combiner for combining the overlapping time-domain data blocks to obtain a decoded data segment of the time-domain data stream.
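The overlap/add combination named in this abstract can be shown with a short Python sketch. It covers only the final combining step, assuming equally sized blocks placed a fixed number of samples apart; the function name `overlap_add`, the `hop` parameter, and the sample blocks are hypothetical, and the decoding that produces the blocks is not modelled.

```python
def overlap_add(blocks, hop):
    """Sum equally sized, overlapping blocks placed `hop` samples apart."""
    block_len = len(blocks[0])
    out = [0.0] * (hop * (len(blocks) - 1) + block_len)
    for index, block in enumerate(blocks):
        offset = index * hop
        for n, sample in enumerate(block):
            # Samples that fall on the same output position are added.
            out[offset + n] += sample
    return out


segment = overlap_add([[1, 2, 3, 4], [10, 20, 30, 40]], hop=2)
```

With a hop of 2 and blocks of 4 samples, the last two samples of the first block are summed with the first two samples of the second block.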
Tue, 26 May 2015 08:00:00 EDTA method (700, 800) and apparatus (100, 200) processes audio frames to transition between different codecs. The method can include producing (720), using a first coding method, a first frame of coded output audio samples by coding a first audio frame in a sequence of frames. The method can include forming (730) an overlap-add portion of the first frame using the first coding method. The method can include generating (740) a combination first frame of coded audio samples based on combining the first frame of coded output audio samples with the overlap-add portion of the first frame. The method can include initializing (760) a state of a second coding method based on the combination first frame of coded audio samples. The method can include constructing (770) an output signal based on the initialized state of the second coding method.
Tue, 26 May 2015 08:00:00 EDTThe present invention is based on the finding that parameters including: a first set of parameters of a representation of a first portion of an original signal and a second set of parameters of a representation of a second portion of the original signal can be efficiently encoded when the parameters are arranged in a first sequence of tuples and a second sequence of tuples. The first sequence of tuples includes tuples of parameters having two parameters from a single portion of the original signal and the second sequence of tuples includes tuples of parameters having one parameter from the first portion and one parameter from the second portion of the original signal. A bit estimator estimates the number of necessary bits to encode the first and the second sequence of tuples. Only the sequence of tuples, which results in the lower number of bits, is encoded.
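The choice between the two tuple arrangements can be sketched in Python as follows. The bit estimator here is a toy stand-in (an assumption, not the patent's estimator) that charges fewer bits when the two members of a tuple are close in value; all function names and the sample parameter lists are hypothetical, and both portions are assumed to hold the same even number of integer parameters.

```python
def estimate_bits(tuples):
    """Toy estimate (assumption): 1 bit plus 2 bits per bit of the difference."""
    return sum(1 + 2 * abs(a - b).bit_length() for a, b in tuples)


def within_portion_tuples(first, second):
    """Pair neighbouring parameters inside each portion."""
    pairs = list(zip(first[0::2], first[1::2]))
    pairs += list(zip(second[0::2], second[1::2]))
    return pairs


def across_portion_tuples(first, second):
    """Pair each parameter of the first portion with its counterpart in the second."""
    return list(zip(first, second))


def choose_grouping(first, second):
    """Return the name and tuples of the arrangement with the lower estimate."""
    options = {
        "within": within_portion_tuples(first, second),
        "across": across_portion_tuples(first, second),
    }
    name = min(options, key=lambda key: estimate_bits(options[key]))
    return name, options[name]


first = [3, 9, 4, 12]
second = [3, 9, 5, 12]
name, tuples = choose_grouping(first, second)
```

In this example the two portions are nearly identical, so pairing across portions gives the lower estimate and only that sequence would be encoded.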
Tue, 26 May 2015 08:00:00 EDTMethods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating search results. In one aspect, a method includes obtaining a transcription of a voice query, and data that identifies an accent of the voice query, submitting the transcription and the data that identifies the accent of the voice query to a search engine to generate one or more accent-influenced results of the voice query, and providing the accent-influenced results to a client device for display.
Tue, 26 May 2015 08:00:00 EDTMethods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatic text suggestion are described. One of the methods includes receiving a text item including one or more terms; determining a plurality of text strings, each text string including a matching portion and one or more suffixes, wherein the matching portion matches the text item, and the one or more suffixes are located after the matching portion; ranking the one or more suffixes based on a credibility score and a frequency score of each suffix, the credibility score indicating an estimated credibility of a source of the text string including the suffix, the frequency score indicating an estimated frequency of appearance of the suffix; and providing a group of the one or more suffixes that includes a highest ranking suffix for display as a suggestion for completing a sentence starting from the text item.
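The ranking step in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the credibility score and the frequency score are combined, so the product used here is an assumption; the function name `rank_suffixes` and the sample scores are hypothetical.

```python
def rank_suffixes(suffix_scores, top_n=3):
    """Order suffixes by credibility * frequency (the product is an assumption).

    suffix_scores maps each suffix to a (credibility, frequency) pair.
    """
    ranked = sorted(
        suffix_scores,
        key=lambda suffix: suffix_scores[suffix][0] * suffix_scores[suffix][1],
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical suffixes that could complete the same text item.
scores = {
    "is round": (0.9, 0.8),
    "is flat": (0.2, 0.9),
    "is blue": (0.7, 0.3),
}
suggestions = rank_suffixes(scores, top_n=2)
```

A frequent suffix from a low-credibility source ("is flat") ranks below a rarer suffix from a more credible one, which is the effect that scoring on both dimensions is meant to achieve.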
Tue, 26 May 2015 08:00:00 EDTMethods, systems, and apparatus, including computer program products, for extracting information from unstructured text. Fact pairs are used to extract basic patterns from a body of text. Patterns are generalized by replacing words with classes of similar words. Generalized patterns are used to extract further fact pairs from the body of text. The process can begin with fact pairs, basic patterns, or generalized patterns.
Tue, 26 May 2015 08:00:00 EDTComputer-implemented systems and methods are provided for suggesting emoticons for insertion into text based on an analysis of sentiment in the text. An example method includes: determining a first sentiment of text in a text field; selecting first text from the text field in proximity to a current position of an input cursor in the text field; identifying one or more candidate emoticons wherein each candidate emoticon is associated with a respective score indicating relevance to the first text and the first sentiment based on, at least, historical user selections of emoticons for insertion in proximity to respective second text having a respective second sentiment; providing one or more candidate emoticons having respective highest scores for user selection; and receiving user selection of one or more of the provided emoticons and inserting the selected emoticons into the text field at the current position of the input cursor.
Tue, 26 May 2015 08:00:00 EDTA system to teach phonemic awareness uses a plurality of phonemes and a plurality of graphemes. Each phoneme is a unique sound and an indivisible unit of sound in a spoken language, and each grapheme is a written representation of one of the plurality of phonemes. A plurality of distinct graphical images and a plurality of unique names are provided where each unique name is associated with one of the graphical images and represents a grouping of graphemes selected from the plurality of graphemes. The system uses a plurality of sets of display pieces having a plurality of individual display pieces. Each individual display piece includes at least a portion of one of the graphical images and the graphemes from the grouping of graphemes constituting the associated unique name. A predefined instructional environment defines a predefined spatial context and predefined rules governing the acquisition and utilization of individual display pieces.
Tue, 26 May 2015 08:00:00 EDTMethods and apparatus to generate and use content-aware watermarks are disclosed herein. In a disclosed example method, media composition data is received and at least one word present in an audio track of the media composition data is selected. The word is then located in a watermark.
Tue, 26 May 2015 08:00:00 EDTAccording to one embodiment, an audio controlling apparatus includes a first receiver configured to receive audio signal, a second receiver configured to receive environmental sound, a temporary gain calculator configured to calculate temporary gain based on environmental sound received by second receiver, a sound type determination module configured to determine sound type of main component of audio signal received by first receiver, and a gain controller configured to stabilize temporary gain that is calculated by temporary gain calculator and set gain, when it is determined that sound type of main component of audio signal received by first receiver is music.
Tue, 26 May 2015 08:00:00 EDTA method comprising: sampling received audio at a first rate to produce a first audio signal; transforming the first audio signal into a sparse domain to produce a sparse audio signal; re-sampling of the sparse audio signal to produce a re-sampled sparse audio signal; and providing the re-sampled sparse audio signal, wherein bandwidth required for accurate audio reproduction is removed but bandwidth required for spatial audio encoding is retained AND/OR a method comprising: receiving a first sparse audio signal for a first channel; receiving a second sparse audio signal for a second channel; and processing the first sparse audio signal and the second sparse audio signal to produce one or more inter-channel spatial audio parameters.
Tue, 26 May 2015 08:00:00 EDTAn apparatus for processing an audio signal and method thereof are disclosed. The present invention includes receiving a downmix signal and side information; extracting control restriction information from the side information; receiving control information for controlling gain or panning at least one object signal; generating at least one of first multi-channel information and first downmix processing information based on the control information and object information, without using the control restriction information; and, generating an output signal by applying the at least one of the first multichannel information and the first downmix processing information to the downmix signal, wherein the control restriction information relates to a parameter indicating limiting degree of the control information.
Tue, 26 May 2015 08:00:00 EDTA decoding apparatus (10) is disclosed which includes: a storing means (11) for storing encoded audio signals including multi-channel audio signals; a transforming means (40) for transforming the encoded audio signals to generate transform block-based audio signals in a time domain; a window processing means (41) for multiplying the transform block-based audio signals by a product of a mixture ratio of the audio signals and a first window function, the product being a second window function; a synthesizing means (43) for overlapping the multiplied transform block-based audio signals to synthesize audio signals of respective channels; and a mixing means (14) for mixing audio signals of the respective channels between the channels to generate a downmixed audio signal. Furthermore, an encoding apparatus is also disclosed which downmixes the multi-channel audio signals, encodes the downmixed audio signals, and generates the encoded, downmixed audio signals.
Tue, 26 May 2015 08:00:00 EDTAn apparatus and an article of manufacture for controlling a voice site using a haptic input modality include validating a haptic input from an instrument capable of accessing a voice site, processing the haptic input on a server to determine a voice site command corresponding to the haptic input, and processing the voice site command at the server to control an interaction with the voice site.
Tue, 19 May 2015 08:00:00 EDTAn embodiment of an analysis filterbank for filtering a plurality of time domain input frames, wherein an input frame comprises a number of ordered input samples, comprises a windower configured to generate a plurality of windowed frames, wherein a windowed frame comprises a plurality of windowed samples, wherein the windower is configured to process the plurality of input frames in an overlapping manner using a sample advance value, wherein the sample advance value is less than the number of ordered input samples of an input frame divided by two, and a time/frequency converter configured to provide an output frame comprising a number of output values, wherein an output frame is a spectral representation of a windowed frame.
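The windower in this abstract can be sketched in Python as follows. The sketch only cuts and windows overlapping frames under the stated constraint that the sample advance is less than half the frame length; the time/frequency converter is omitted, and the function name `windowed_frames`, the rectangular window, and the sample values are hypothetical.

```python
def windowed_frames(samples, frame_len, advance, window):
    """Cut overlapping frames `advance` samples apart and apply the window."""
    if advance >= frame_len / 2:
        raise ValueError("sample advance must be less than half the frame length")
    frames = []
    for start in range(0, len(samples) - frame_len + 1, advance):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# 12 input samples, frames of 8 samples, advance of 2 samples (2 < 8 / 2).
frames = windowed_frames(list(range(12)), frame_len=8, advance=2, window=[1.0] * 8)
```

Because the advance is smaller than half the frame length, each input sample away from the edges falls into more than two consecutive frames.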
Tue, 19 May 2015 08:00:00 EDTMethods, systems, and techniques for keyword management are described. Some embodiments provide a keyword management system (“KMS”) configured to determine the effectiveness of multiple candidate keywords. In some embodiments, the KMS generates multiple candidate keywords based on an initial keyword. The KMS may then determine an effectiveness score for each of the candidate keywords, based on marketing information about those keywords. Next, the KMS may process the candidate keywords according to the determined effectiveness scores. In some embodiments, processing the candidate keywords includes applying rules that conditionally perform actions with respect to the candidate keywords, such as modifying advertising expenditures, modifying content, or the like.
Tue, 19 May 2015 08:00:00 EDTA computer-readable, non-transitory medium storing a character string comparison program is provided. The program causes, when executed by a computer, the computer to perform a process including splitting a first character string and a second character string into words; acquiring information including a semantic attribute that represents a semantic nature of each of the words and a conceptual code that semantically identifies said each of the words, from a storage device; identifying a pair of the words having a common semantic attribute between the first character string and the second character string; comparing the conceptual codes of the specified pair of the words between the first character string and the second character string; and generating a comparison result between the first character string and the second character string based upon a comparison result of the conceptual codes.
Tue, 19 May 2015 08:00:00 EDTLow bit rate audio coding, such as a BWE algorithm, often encounters the conflicting goals of achieving high time resolution and high frequency resolution at the same time. In order to achieve the best possible quality, the input signal can first be classified as a fast signal or a slow signal. This invention focuses on classifying the signal as a fast signal or a slow signal, based on at least one of the following parameters or a combination of them: spectral sharpness, temporal sharpness, pitch correlation (pitch gain), and/or spectral envelope variation. This classification information can help to choose different BWE algorithms, different coding algorithms, and different postprocessing algorithms for fast signals and slow signals respectively.
Tue, 19 May 2015 08:00:00 EDTA device may include a physical phenomenon detector. The physical phenomenon detector may detect a physical phenomenon related to the device. In response to detecting the physical phenomenon, the device may record audio data that includes speech. The speech may be transcribed with a speech recognition engine. The speech recognition engine may be included in the device, or may be included with a remote computing device with which the device may communicate.
Tue, 19 May 2015 08:00:00 EDTCurrent human-to-machine interfaces enable users to interact with a company's database and enter into a series of transactions (e.g., purchasing products/services and paying bills). Each transaction may require several operations or stages requiring user input or interaction. Some systems enable a user to enter a voice input parameter providing multiple operations of instruction (e.g., a single natural language command). However, users of such a system do not know what types of commands the system is capable of accepting. Embodiments of the present invention facilitate communications for user transactions by determining a user's goal transaction and presenting a visual representation of a voice input parameter for the goal transaction. The use of visual representations notifies the user of the system's capability of accepting single natural language commands and the types of commands the system is capable of accepting, thereby enabling a user to complete a transaction in a shorter period of time.
Tue, 19 May 2015 08:00:00 EDTAn image processing apparatus including: an image processor which processes a broadcasting signal to display an image based on the processed broadcasting signal; a communication unit which is connected to a server; a voice input unit which receives a user's speech; a voice processor which processes a performance of a preset corresponding operation according to a voice command corresponding to the speech; and a controller which processes the voice command corresponding to the speech through one of the voice processor and the server if the speech is input through the voice input unit. If the voice command includes a keyword relating to a call sign of a broadcasting channel, the controller controls one of the voice processor and the server to select a recommended call sign corresponding to the keyword according to a predetermined selection condition, and performs a corresponding operation under the voice command with respect to the broadcasting channel of the recommended call sign.
Tue, 19 May 2015 08:00:00 EDTApparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
Tue, 19 May 2015 08:00:00 EDTAn apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from a remote source into at least one of the plurality of applications based on the identified voice command.
Tue, 19 May 2015 08:00:00 EDTMethods, apparatus, and computer programs for simulating the source of sound are provided. One method includes operations for determining a location in space of the head of a user utilizing face recognition of images of the user. Further, the method includes an operation for determining a sound for two speakers, and an operation for determining an emanating location in space for the sound, each speaker being associated with one ear of the user. The acoustic signals for each speaker are established based on the location in space of the head, the sound, the emanating location in space, and the auditory characteristics of the user. In addition, the acoustic signals are transmitted to the two speakers. When the acoustic signals are played by the two speakers, the acoustic signals simulate that the sound originated at the emanating location in space.
Tue, 19 May 2015 08:00:00 EDTA method of complementing a spoken text. The method includes receiving text data representative of a natural language text, receiving effect control data including at least one effect control record, each effect control record being associated with a respective location in the natural language text, receiving a stream of audio data, analyzing the stream of audio data for natural language utterances that correlate with the natural language text at a respective one of the locations, and outputting, in response to a determination by the analyzing that a natural language utterance in the stream of audio data correlates with a respective one of the locations, at least one effect control signal based on the effect control record associated with the respective location.
Tue, 19 May 2015 08:00:00 EDTMethods, systems, and computer program products are provided for email administration for rendering email on a digital audio player. Embodiments include retrieving an email message; extracting text from the email message; creating a media file; and storing the extracted text of the email message as metadata associated with the media file. Embodiments may also include storing the media file on a digital audio player and displaying the metadata describing the media file, the metadata containing the extracted text of the email message.
Tue, 19 May 2015 08:00:00 EDTA method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
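The phrase-based detection and rating in the entry above could be sketched along these lines. The phrase lists, the substring matching, and the weighting (event phrases count double) are hypothetical choices made for illustration; the abstract does not specify them.

```python
# Hypothetical phrase lists; the abstract does not enumerate any phrases.
EVENT_PHRASES = ("this call may be recorded",)
PRECURSOR_PHRASES = ("i want to cancel", "speak to a manager")

def find_phrases(communication, phrases):
    """Return the pre-determined phrases present in the communication."""
    normalized = " ".join(communication.lower().split())
    return [phrase for phrase in phrases if phrase in normalized]

def rate_recipient(communication):
    """Flag event and precursor-event phrases and derive a simple rating."""
    event_hits = find_phrases(communication, EVENT_PHRASES)
    precursor_hits = find_phrases(communication, PRECURSOR_PHRASES)
    return {
        "event": bool(event_hits),
        "precursor_event": bool(precursor_hits),
        # Assumed weighting: an event phrase counts twice a precursor phrase.
        "rating": 2 * len(event_hits) + len(precursor_hits),
    }
```

Whitespace and case are normalized before matching so that transcripts with irregular spacing still compare cleanly against the phrase lists.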
Tue, 19 May 2015 08:00:00 EDTMethods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numeric representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.